The COACH prompting system to assist older adults with dementia through handwashing: An efficacy study

Mihailidis, Alex; Boger, Jennifer N; Craig, Tammy; Hoey, Jesse

doi:10.1186/1471-2318-8-28

Research article
Open access
Published: 07 November 2008

Alex Mihailidis¹,², Jennifer N Boger¹,², Tammy Craig¹ & Jesse Hoey³

BMC Geriatrics volume 8, Article number: 28 (2008)

Abstract

Background

Many older adults with dementia require constant assistance from a caregiver when completing activities of daily living (ADL). This study examines the efficacy of a computerized device intended to assist people with dementia through ADL, while reducing caregiver burden. The device, called COACH, uses artificial intelligence to autonomously guide an older adult with dementia through the ADL using audio and/or audio-video prompts.

Methods

Six older adults with moderate-to-severe dementia participated in this study. Handwashing was chosen as the target ADL. A single subject research design was used with two alternating baseline (COACH not used) and intervention (COACH used) phases. The data were analyzed to investigate the impact of COACH on the participants' independence and caregiver burden as well as COACH's overall performance for the activity of handwashing.

Results

Participants with moderate-level dementia were able to complete an average of 11% more handwashing steps independently and required 60% fewer interactions with a human caregiver when COACH was in use. Four of the participants achieved complete or very close to complete independence. Interestingly, participants' MMSE scores did not appear to robustly coincide with handwashing performance and/or responsiveness to COACH; other idiosyncrasies of each individual seem to play a stronger role. While the majority (78%) of COACH's actions were considered clinically correct, areas for improvement were identified.

Conclusion

The COACH system shows promise as a tool to help support older adults with moderate levels of dementia and their caregivers. These findings reinforce the need for flexibility and dynamic personalization in devices designed to assist older adults with dementia. After addressing identified improvements, the authors plan to run clinical trials with a sample of community-dwelling older adults and caregivers.

Background

Globally, the number of individuals aged 65 years and older is predicted to increase steadily, particularly among the oldest old (aged 80 years and over) after the year 2010 [1]. This will result in an increase in the worldwide number of individuals diagnosed with dementia, particularly Alzheimer's Disease (AD), from the current estimate of 24.3 million individuals in 2006 to 81.1 million by 2040 [2].

Older adults have a strong preference for aging-in-place (i.e. remaining in their own homes and communities) compared to other forms of care, such as nursing homes and other long-term care facilities [3]. Additionally, various studies have implied that older adults (particularly those who have AD) benefit from aging in environments to which they are accustomed as familiar environments can provide memory and task cues [4–6]. However, this shift from the hospital to home-based care means that family members and other informal caregivers are being increasingly depended upon to attend to the long-term health-care needs of older adults with AD. Increased dependence and changes in the relationship dynamic are difficult for both people with AD and their family caregivers to accept [7]. The constant pressure to meet their relative's needs for assistance and support can result in debilitating levels of stress for the caregiver, resulting in the affected person's placement into long-term care. From a caregiver's perspective, decreasing the number of interactions required to complete an activity of daily living (ADL) has a direct positive impact on caregiver burden. Even small decreases in caregiver burden have been found to alleviate the prevalence of depressive symptoms in caregivers of individuals with AD [8]. This can lead to more successful informal care, resulting in lower medical costs and delayed long-term care placements.

To support aging-in-place, older adults and their caregivers are increasingly relying on the use of computerized Cognitive Assistive Technologies (CATs) to complete ADL [6]. Often coupled with some form of artificial intelligence (AI), CATs strive to compensate for cognitive impairments, thereby enhancing the user's autonomy [9]. The maintenance or increase of independence is coupled with a reduction in the levels of caregiver assistance, and likely caregiver burden, as well as a decrease in home health care costs [10].

A significant amount of recent work in CATs for assisting people with cognitive impairments uses probabilistic models to infer task and occupant status from sensors distributed throughout a person's living environment [11]. For example, Autominder, developed by Pollack et al. [12], uses artificial intelligence planning to schedule events such as medication taking around a person's daily schedule, such as favorite television programs or daily walks. Autominder uses environmental sensors to detect the status of activities, and if required, will provide the user with context-aware reminders regarding unattended activities. The Gator Tech Smart House [13] is an example of a smart home designed with older adults in mind. Sensors distributed throughout the house interact with applications running on computers to take context into account when performing actions. For example, if it is a sunny day outside and the resident has the television on, the Gator Tech Smart House will automatically close the blinds to reduce glare. Other features include medication reminders that can appear on the bathroom mirror and automatic sensing and ordering for soap and toilet paper refills. Pigot et al. [14] developed Archipel, a cognitive modeling system for cooking tasks that recognizes the user's intended plan and adapts prompting to a pre-determined cognitive impairment level. Sensors, such as RFID tags and readers, in the kitchen environment detect which objects have been used and provide cues (audio, video and strategic lighting) to help users through each step in the task. As with Autominder, Archipel will not give reminders for tasks the user has already accomplished.

Research is increasingly emphasizing the importance of maintaining functional independence as a way of maintaining good health and wellness among older adults with dementia, while simultaneously reducing medical expenditures [15, 16]. However, the extent to which a CAT can aid an individual with AD depends on the user's willingness to use it, which in turn depends on whether the individual and/or his/her caregiver can operate the device and feels that the device is useful, and on whether the device supports or undermines the sense of personal identity [17]. To be useful to both a care recipient with dementia and his/her caregiver(s), a CAT must be autonomous, non-invasive, and must not require explicit feedback (e.g. button presses), as this cannot reasonably be expected of either people with AD or overworked caregivers. Cognitive assistance should be able to accommodate high levels of customization, as the more the assistance is personalized and appropriate to the deficits in question, the more likely it will be adhered to and understood by the user [18]. Finally, assistance should only be given on an "as needed" basis to minimize confusion and to keep the user as cognitively involved in the task as possible.

The majority of currently available CATs require extensive sensor deployment and maintenance and/or input from a cognitively intact individual. Most likely the caregiver of the individual with dementia would have to learn how to operate and (to some degree) maintain a potentially complex planning system. As many caregivers are overburdened as it is, two goals of the system described in this paper were to minimize the amount of hardware that was needed, and to have the system function without any explicit input from the user or the caregiver.

The result was the COACH (Cognitive Orthosis for Assisting aCtivities in the Home), a system that employs various computer vision and artificial intelligence techniques to autonomously provide the user with verbal and/or visual reminders as necessary during ADL. Table 1 summarizes the progression of the systems used in the previous versions of COACH. The systems in each version of COACH represent significant advances in the sophistication and versatility compared to those used in the previous version. The systems for the latest version of COACH (Version 3 in Table 1) are described in more detail in the Methods section below.

Table 1 Summary of previous COACH systems.

This paper presents results from an eight-week efficacy study of the COACH with older adults with dementia. Methods and results are presented, followed by a discussion regarding the potential clinical significance of the participants' and device performances. While a brief description of the technology will be provided in this paper, the reader is referred to [19] for an in-depth description of the COACH system and algorithms.

Objective

The objective of this study was to answer the following research questions:

1. Is the COACH system able to guide an older adult with dementia through the handwashing ADL with less dependence on a caregiver? If dependence decreases, it should be reflected in an increase in the number of steps in the handwashing activity the older adult is able to complete independently from a caregiver (i.e. with no assistance from the caregiver).
2. Does this new version of COACH reduce caregiver workload? If the caregiver's workload is reduced, this should be reflected in a decrease in the number of times a caregiver interacts with his/her care recipient.
3. How will the COACH system perform with respect to its ability to correctly provide assistance to the user throughout the ADL? To achieve a positive outcome, the system must be able to follow the older adult through the handwashing task, autonomously giving correct prompts if (and only if) they are needed.

Methods

Device (COACH) design

In this work the authors build upon the two previous versions of the COACH device (summarized in Table 1), which both focused on the activity of handwashing [20–22]. Handwashing was chosen as the model ADL because it is a relatively safe activity that older adults with dementia have difficulty completing because of the required planning and initiation skills.

Handwashing was defined as having five essential steps that must be accomplished for successful activity completion, which are depicted in Figure 1. COACH guided users through these steps using four integrated components: the tracking system, belief monitoring system, policy, and prompting system, as represented in Figure 2.

Images captured by a video camera are processed by the tracking system, and the hand and towel positions are passed to the belief monitoring system. These data are used by the belief monitoring system to compute the belief state, a probabilistic estimate of the current state of the user and environment. The belief state is passed from the belief monitoring system to the policy, which is essentially a lookup table that denotes the best course of action for the system to take for every state that could be received from the belief monitor. Each belief state received from the belief monitoring system is thus translated by the policy into an action for COACH to take. The actions available to COACH are to give a low-guidance verbal prompt, give a high-guidance verbal prompt, give a verbal prompt with a video demonstration of the action, call the caregiver to intervene, or do nothing (i.e., continue to observe the user). These different levels of prompting assistance give COACH the ability to select the most appropriate support for each individual's stage of AD and overall responsiveness. Thus the level of detail played for the user is based on factors such as the error committed, the sensory and cognitive status of the user, and responsiveness to previous prompts.
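The policy lookup described above can be illustrated with a minimal sketch. This is not the authors' implementation: the state labels, the hand-written table, and the shortcut of acting on the single most likely state are all assumptions made for illustration. Only the five action types are taken from the text.

```python
from enum import Enum


class Action(Enum):
    """The five action types listed in the text."""
    DO_NOTHING = "continue observing"
    LOW_GUIDANCE = "low-guidance verbal prompt"
    HIGH_GUIDANCE = "high-guidance verbal prompt"
    VIDEO = "verbal prompt with video demonstration"
    CALL_CAREGIVER = "call caregiver"


# Hypothetical table: (user status, prompts already ignored) -> action.
# The real policy is computed from the POMDP model, not written by hand.
POLICY = {
    ("progressing", 0): Action.DO_NOTHING,
    ("stalled", 0): Action.LOW_GUIDANCE,
    ("stalled", 1): Action.HIGH_GUIDANCE,
    ("stalled", 2): Action.VIDEO,
    ("stalled", 3): Action.CALL_CAREGIVER,
}


def most_likely_state(belief):
    """Return the state with the highest probability in the belief."""
    return max(belief, key=belief.get)


def select_action(belief):
    """Look up the action for the most likely state; default to observing."""
    return POLICY.get(most_likely_state(belief), Action.DO_NOTHING)


# Illustrative belief: the user has most likely stalled after one ignored prompt.
belief = {("progressing", 0): 0.2, ("stalled", 1): 0.7, ("stalled", 2): 0.1}
chosen = select_action(belief)
print(chosen.value)
```

Collapsing the belief to its most likely state discards the uncertainty that a POMDP policy would normally exploit; the sketch is intended only to show the flow from belief state to action.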

The COACH system presented above had three significant changes from the previous versions: 1) the use of markerless flocking to track the activity; 2) the use of a partially observable Markov decision process (POMDP) to model the handwashing guidance problem; and 3) the refinement of audio prompts and the addition of video demonstrations. Tracking was accomplished using a computer vision technique known as flocking, which was developed by Hoey et al. [23]. It uses models of skin and towel color combined with a Bayesian sequential estimation technique. This method of tracking is quite robust and able to dependably track the location of the user's hands and the position of the towel, even after occlusion by an object or after leaving and returning to the camera's field of view. A POMDP was chosen as the basis for the new planning system because of this model's ability to make good decisions under uncertainty and to make intelligent inferences, and therefore decisions, about unobservable states (e.g. a user's level of dementia) [24]. This type of model allows the COACH system to autonomously tailor itself to the individual needs of its users because it can estimate and use an individual's traits (e.g. cognitive awareness and responsiveness levels) to dynamically adapt to daily and long-term needs. Implementation of a POMDP is an important contribution not only to the field of artificial intelligence but also to the usability concerns of users and their caregivers, as it enables user-specific prompting strategies while remaining autonomous. Further details regarding the technical nature of COACH, including detailed system descriptions and planning algorithms, can be found in [19].
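To make the idea of inferring unobservable states concrete, the following sketch shows one step of the standard discrete POMDP belief update (predict with the transition model, weight by the observation likelihood, normalize). The two states, the single action, the observations, and all probabilities are invented for illustration; the actual COACH model is described in [19] and is considerably richer.

```python
def update_belief(belief, action, observation, transition, observation_model):
    """One step of the standard discrete POMDP belief update.

    belief:            dict state -> probability
    transition:        dict (state, action) -> dict next_state -> probability
    observation_model: dict next_state -> dict observation -> probability
    """
    unnormalized = {}
    for next_state, obs_probs in observation_model.items():
        # Prediction: probability of reaching next_state from the old belief.
        predicted = sum(
            prob * transition[(state, action)].get(next_state, 0.0)
            for state, prob in belief.items()
        )
        # Correction: weight by how well next_state explains the observation.
        unnormalized[next_state] = obs_probs.get(observation, 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {state: p / total for state, p in unnormalized.items()}


# Hypothetical two-state model: is the user on track or stuck?
transition = {
    ("on_track", "wait"): {"on_track": 0.8, "stuck": 0.2},
    ("stuck", "wait"): {"on_track": 0.1, "stuck": 0.9},
}
observation_model = {
    "on_track": {"hands_moving": 0.9, "hands_still": 0.1},
    "stuck": {"hands_moving": 0.3, "hands_still": 0.7},
}

prior = {"on_track": 0.5, "stuck": 0.5}
posterior = update_belief(prior, "wait", "hands_still", transition, observation_model)
print(posterior)
```

In this toy model, observing still hands shifts most of the probability mass toward the "stuck" state, which is the kind of evidence a policy could use to decide that a prompt is warranted.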

Audio prompts were recorded using a professional male actor to emulate the cadence and tone of a professional caregiver. A male voice was used (as opposed to a female one) because previous research by this group and others suggests that male voices are easier to hear and understand, possibly because the male voice has a lower pitch/frequency [22]. The wording used for the prompts is shown in Table 2 and was similar to the wording used in previous studies, modified slightly according to recommendations from Wilson et al. [18]. Each prompt began with the participant's name to get his/her attention. Previous studies with COACH have found that some users can become confused about which activity they were asked to complete (e.g. previous participants have been known to wash the towel in the sink or wash their faces); therefore, the prompt often contained a reminder to help participants remember which activity they were attempting to complete. The prompt then gave the participant guidance for the step in the activity s/he was attempting. The potential usefulness of adding video demonstrating correct completion of the activity step was examined by Labelle and Mihailidis [25]. Results were positive; therefore, audio-video capabilities were added to this version of COACH. The videos used in this study were shot from the perspective of the participant. They were pre-recorded in the same washroom that was used in the trials and combined with the maximal-assistance verbal prompts. A frame from one of the videos is shown on the monitor in Figure 3b.

Table 2 Wording for the prompts used by COACH.

Participants and Ethics/Consent Process

This study was reviewed and approved by the Toronto Rehabilitation Institute's Research Ethics Board (REB). Potential participants were identified by the staff at the long-term care (LTC) facility in Toronto, Canada where the study took place. Informed consent to participate was obtained in writing (using the consent form approved by the REB) from the participants' substitute decision makers, after the study was described to them using an information sheet and informal interview.

Participants had to meet the following inclusion/exclusion criteria: over 65 years of age, no history of violence, fluent in English, able to hear normal levels of speech, no severe motor impairments, and moderate-to-severe dementia. Level of dementia was determined through the administration of the Mini-Mental State Examination (MMSE), an assessment instrument that is commonly used to estimate the level of cognitive impairment in adults [26]. Typically, individuals are separated into four categories of impairment based on their MMSE scores: no impairment (30–26 points), mild (25–20 points), moderate (19–10 points), and severe (9–0 points). Each participant's dementia level was scored using the MMSE before the start and upon completion of the trials.
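The MMSE cut-offs listed above can be written as a small classification function. The sketch below uses only the score bands given in the text; the function name and the range check are illustrative additions.

```python
def mmse_category(score: int) -> str:
    """Map an MMSE score (0 to 30) to the impairment bands given in the text."""
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if score >= 26:
        return "no impairment"
    if score >= 20:
        return "mild"
    if score >= 10:
        return "moderate"
    return "severe"


# Boundary scores for each band.
for example_score in (30, 26, 25, 20, 19, 10, 9, 0):
    print(example_score, mmse_category(example_score))
```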

Apparatus set-up

The study was conducted in a retrofitted washroom and adjoining office that were dedicated to the project by the LTC facility. The washroom was fitted with a ceiling-mounted IEEE-1394 digital video camera (Point Grey Research DragonFly2), and a wall-mounted 21-inch LCD screen and desktop speakers (Figure 3a). A Dell Latitude laptop computer (2 GHz processor, 2 GB RAM) was used as the processing unit for the system software and hardware, and ran the operator graphical user interface, which displayed information about the system variables (e.g. estimated plan steps, system response, etc.) and the participant's progress through the task. The trials were also recorded using a camcorder positioned above the participant to capture video for post-trial evaluation by human raters (this recording was not used by the system during the trials).

Study design

A single subject research design (SSRD) was used in this study because of the difficulty in recruiting and maintaining an adequate sample size and the variability of the participants' health [27–31]. This research design has been used in the authors' previous studies and has been found to be the most appropriate procedure for the evaluation of this type of technology. The study consisted of two baseline phases, A₁ and A₂ (COACH not used), and two intervention phases, B₁ and B₂, (COACH used), run in the order A₁-B₁-A₂-B₂ to identify any carry-over effects. Based on studies completed with previous versions of COACH, 10 trials per phase were deemed to be sufficient for participant performance to stabilize and for the desired changes in the dependent variables to be observed [22].

Procedure

Each participant completed one trial per day, Monday to Friday, for eight weeks, for a total of 40 trials per participant. To ensure uniformity and avoid any potential risk of injury from falls, each participant was required to sit in a wheelchair and was taken to the test washroom by a caregiver who was hired for this study. The caregiver positioned the participant in front of the sink in the test washroom and asked the participant to wash his/her hands.
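The arithmetic of the schedule (four phases of 10 trials, five trials per week, eight weeks) can be checked with a short sketch. It assumes that each phase is a consecutive block of 10 trials and that no trial days were missed; the field names are hypothetical.

```python
PHASES = ("A1", "B1", "A2", "B2")   # baseline and intervention phases, in study order
TRIALS_PER_PHASE = 10
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")


def build_schedule():
    """Return one record per trial for a single participant."""
    schedule = []
    for index in range(len(PHASES) * TRIALS_PER_PHASE):
        phase = PHASES[index // TRIALS_PER_PHASE]
        schedule.append({
            "trial": index + 1,
            "week": index // len(WEEKDAYS) + 1,
            "day": WEEKDAYS[index % len(WEEKDAYS)],
            "phase": phase,
            "coach_used": phase.startswith("B"),  # COACH runs only in B phases
        })
    return schedule


schedule = build_schedule()
print(len(schedule), schedule[0], schedule[-1])
```

Under these assumptions each phase spans exactly two weeks, so the A₁-B₁-A₂-B₂ sequence fills the eight-week study period.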

During the A-phases of the trials, the caregiver interacted with the participants as she normally would, providing any prompts and reminders she felt were necessary to complete handwashing. During the B-phases, COACH was started by a researcher (who was hidden from the user) as soon as the caregiver requested the participant to wash his/her hands. The caregiver then left the participant alone in the test washroom and discreetly observed him/her from the hallway. The caregiver provided assistance only if instructed to do so by COACH (i.e., the caregiver was summoned by the device to intervene) or if the caregiver felt the need to intervene for the well-being of the participant (e.g., the participant was attempting to stand up from the wheelchair or was becoming upset).

Data collection tools

A score sheet was used to collect data required to evaluate the system's efficacy in terms of both user and system performance. The score sheet was the same one that was developed and used in studies that examined previous versions of the COACH (refer to [22]).

With respect to user performance, scales on the score sheet measured the following for each step of the activity: 1) independent step completion; 2) number of caregiver interactions; and 3) functional assessment scale (FAS). Independent step completion was scored for every trial. A participant scored one point the first time s/he completed a step in a trial without assistance of any kind from a human caregiver. As there were five essential steps (Figure 1), the maximum score that could be attained was five, even if a participant completed more than five steps independently. For example, a participant could independently turn the water on, wet her hands, turn off the water, get some soap, turn the water on again, rinse her hands, turn off the water, and finally dry her hands. Although none of the actions in this sequence would be technically incorrect, the participant would still score a five on independent step completion. Number of caregiver interactions was a count of the number of times the caregiver had to interact with the participant to get him/her to complete a step. An interaction was considered to be any exchange between the caregiver and the participant that was related to activity completion, including verbal prompting, demonstration, and touching (either the participant or an object). The functional assessment score (FAS) is a modified version of the Functional Independence Measure (FIM™), which is a standardized assessment tool used to measure one's ability to function with independence over 18 activities of daily living [32]. Participants received an FAS for each step in the activity, and scores ranged from zero (no attempt/refusal) to seven (complete independence), with an overall maximum of 35. If the participant completed the step in response to prompts provided by the COACH, a score of seven was given. A higher cumulative FAS is expected to correlate with higher levels of activity completion independence. The face validity of the FAS was demonstrated in previous trials by Mihailidis et al. [22, 33].
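The independent step completion rule can be sketched as follows. The step labels are hypothetical stand-ins for the five essential steps in Figure 1, and the sketch assumes that a step earns its point on its first unassisted completion, with repeats and caregiver-assisted completions earning nothing.

```python
# Hypothetical labels for the five essential steps (see Figure 1 for the actual steps).
ESSENTIAL_STEPS = ("water_on", "soap", "rinse", "water_off", "dry")


def independent_step_score(events):
    """Score a trial from an ordered list of (step, caregiver_assisted) pairs.

    Each essential step counts once, the first time it is completed without
    caregiver assistance, so the score can never exceed five.
    """
    credited = set()
    for step, caregiver_assisted in events:
        if step in ESSENTIAL_STEPS and not caregiver_assisted:
            credited.add(step)
    return len(credited)


# Seven unassisted actions, with the water turned on and off twice.
repeated_trial = [
    ("water_on", False), ("soap", False), ("water_off", False),
    ("water_on", False), ("rinse", False), ("water_off", False),
    ("dry", False),
]
# The caregiver helped with the soap step only.
assisted_trial = [
    ("water_on", False), ("soap", True), ("rinse", False),
    ("water_off", False), ("dry", False),
]
print(independent_step_score(repeated_trial), independent_step_score(assisted_trial))
```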

During the B-phases, data were collected regarding the system responses to participant performance during the handwashing activity. These data were collected based on the basic principles of signal detection theory (SDT) [34], which can be used to measure four conditions describing device performance with respect to: hits, false alarms, correct rejects, and misses. These conditions with respect to the COACH system are outlined in Table 3. For each step in the activity, the system was rated as having at least one, and potentially more, of the four possible SDT conditions. For example, if the COACH gave three incorrect prompts and one correct prompt for a step, three false alarms and one hit would be scored.
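A minimal sketch of how the four conditions might be tallied for one step is given below. It assumes the usual signal detection mapping (a prompt given when one was warranted is a hit, a prompt given when none was warranted is a false alarm, and so on); the precise definitions used in the study are those in Table 3, and the function names are illustrative.

```python
from collections import Counter


def classify(prompt_warranted: bool, coach_prompted: bool) -> str:
    """Map one observed event to a signal detection condition."""
    if prompt_warranted:
        return "hit" if coach_prompted else "miss"
    return "false_alarm" if coach_prompted else "correct_reject"


def tally_step(events):
    """Count the conditions recorded for one step of the activity."""
    return Counter(classify(warranted, prompted) for warranted, prompted in events)


# The example from the text: three incorrect prompts and one correct prompt.
step_events = [(False, True), (False, True), (False, True), (True, True)]
step_tally = tally_step(step_events)
print(dict(step_tally))
```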

Table 3 The four possible conditions used to determine COACH's performance.

Analysis of participant and device performance

Video of each trial was reviewed and scored by an experienced rater using a multi-modal score sheet to collect the types of data described previously. An experienced rater was defined as a researcher who had been trained on the scoring methods and had previous experience rating COACH trials. Space on each score sheet was provided for any general comments or observations.

Because of the small number of participants in the study, visual analyses of the data were used to identify trends of participant behaviors and abilities, and compare changes in variability between phases. Visual analysis is a commonly used technique for single-subject research designs. Data were examined for all trials and overall trends of in-group performance between baseline (A) and intervention (B) phases, as well as for variations in participant performance. Observed participant behaviors and reactions were used to aid in the analysis of the results.

Analyses of the device performance data were achieved through the calculation of the number of hits, misses, false alarms, and correct rejects (described in Table 3) made by the system during the intervention (B) phases. These data were also used to calculate two types of error: E_w (Equation 1) which reflects COACH not detecting an error when participants made one, thus not giving a prompt, and E_c (Equation 2) which reflects COACH detecting an error when none occurred, thus erroneously giving a prompt. These equations were derived by Mihailidis [33] for the analysis of previous research on COACH.

E_w = \frac{\text{Misses}}{\text{Hits} + \text{Misses}} \times 100 \qquad (1)

E_c = \frac{\text{False Alarms}}{\text{False Alarms} + \text{Correct Rejects}} \times 100 \qquad (2)
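Equations 1 and 2 translate directly into code. In the sketch below the function name is illustrative, and the counts in the example are made up to show the arithmetic; they are not data from this study.

```python
def error_rates(hits, misses, false_alarms, correct_rejects):
    """Return (E_w, E_c) as percentages, following Equations 1 and 2."""
    if hits + misses == 0 or false_alarms + correct_rejects == 0:
        raise ValueError("each error rate needs at least one scored event")
    e_w = 100 * misses / (hits + misses)
    e_c = 100 * false_alarms / (false_alarms + correct_rejects)
    return e_w, e_c


# Illustrative counts only: 90 hits, 10 misses, 25 false alarms, 75 correct rejects.
e_w, e_c = error_rates(hits=90, misses=10, false_alarms=25, correct_rejects=75)
print(e_w, e_c)
```

With these illustrative counts, E_w is 10% (one in ten participant errors went unprompted) and E_c is 25% (a prompt was given in a quarter of the cases where none was needed).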

Results

Inter-rater agreement

To ensure data reliability, a second experienced rater scored 20 percent of all data collected regarding participant performance, and inter-rater agreement was calculated using Cohen's Kappa (using SPSS v15.0) [27]. The measures of agreement (K values) were K = 0.96 (p < 0.0005) for independent step completion, K = 0.69 (p < 0.0005) for number of caregiver interactions, and K = 0.63 (p < 0.0005) for FAS.
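For readers unfamiliar with the statistic, the following sketch computes unweighted Cohen's Kappa from two raters' scores as (observed agreement − chance agreement) / (1 − chance agreement). The authors used SPSS; this function and the ratings in the example are illustrative only and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of trials")
    n = len(rater_a)
    # Proportion of trials on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


# Made-up step-completion scores for ten trials; the raters disagree on one.
first_rater = [5, 5, 4, 3, 5, 4, 5, 2, 5, 4]
second_rater = [5, 5, 4, 3, 5, 3, 5, 2, 5, 4]
kappa = cohens_kappa(first_rater, second_rater)
print(round(kappa, 2))
```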

Participants

Eight participants were recruited for this study; however, two were withdrawn: S2 developed unrelated health problems, and S7's aggressive behavior caused concerns for the well-being of both herself and the study caregiver. Demographics for the remaining six participants are presented in Table 4. Based on their initial MMSE scores, five participants (S3, S4, S5, S6 and S8) were classified as having moderate-level dementia, and one participant (S1) was classified as having severe-level dementia.

Table 4 Demographics of the participants.

Participant performance

As S1 was the only participant in the severe-level group and showed noticeably different trends from the other participants, this sub-section examines the moderate-level participants (S3, S4, S5, S6 and S8) as a group. Table 5 summarizes overall individual participant performance per test phase and shows improvements in all three areas, particularly a reduction in the number of interactions with the caregiver. From Table 6 it can be seen that four of the five participants were able to independently complete the activity. Table 7 shows the overall number of interactions with the caregiver required by the participant to successfully complete essential handwashing steps, which decreased by an average of 66% when the device was introduced. Table 8 shows that the participants' FAS for the handwashing activity increased by a negligible 2% for the group. Figures 4 to 6 depict the daily average performance for the entire moderate-level participant group (n = 5) for the number of steps completed independently, the number of interactions with a caregiver, and FAS, respectively.

Table 5 Average participant performance for each trial phase and overall group performance.

Table 6 Average number of steps per trial completed independently without (Phase A) and with (Phase B) COACH

Table 7 Average number of interactions with the caregiver per trial without (Phase A) and with (Phase B) COACH

Table 8 Average participant FAS scores per trial without (Phase A) and with (Phase B) COACH

Device performance

A summary of the data regarding COACH performance is presented in Table 9. It should be noted that the item "Participant ignored prompt from COACH" in Table 9 represents the combined number of both ignored hits and ignored false alarms. The error rates, E_w and E_c (described by Equations 1 and 2), were found to be 10.9% and 26.0% respectively. This can be interpreted as COACH not responding to 10.9% of the errors made by participants (E_w) and COACH giving an unnecessary prompt in 26.0% of the cases where the participant was completing the step correctly (E_c).

Table 9 Device performance with regards to COACH's response to participants' actions and participants' reactions to prompts given by COACH.
