In-home physical frailty monitoring: relevance with respect to clinical tests

Background: Frailty detection and remote monitoring are of major importance for slowing down, or even stopping, the frailty process in home-dwelling older people. Taking Fried’s criteria as a reference, this work aims to compare the results produced by a technological set (the ARPEGE Pack) with those obtained by usual clinical tests, and to discuss the ability of the Pack to be used for long-term remote frailty monitoring. Methods: 194 participants were given a number of geriatric tests and asked to use the ARPEGE technological tools as well as reference clinical tools to feed Fried’s indicators. Spearman or Pearson correlation coefficients were used to compare the ARPEGE results to the reference ones, depending on the statistical characteristics of the data. Results: Good correlations were obtained for measurements of weight (0.99), grip strength (0.89) and walking speed (0.79). Results are much less satisfactory for the evaluation of physical activity and exhaustion (Spearman correlation coefficients 0.25 and 0.41, respectively). Conclusion: Correlations regarding weight, grip strength and walking speed confirm the validity of the data produced by the ARPEGE Pack to feed Fried’s criteria. Assessing activity level and exhaustion from an abbreviated questionnaire is still questionable. However, for long-term monitoring, other methods of evaluation can be explored. Beyond the quantitative results, the ARPEGE Pack proved to be acceptable and motivating for such long-term frailty monitoring.


Background
Population demographics across a large number of countries in Europe and around the world are rapidly changing due to an ageing population. For example, the percentage of the population aged over 65 years in Europe is currently estimated at 18%, a figure that is expected to rise to nearly 30% by the year 2050 [1]. The percentage of the population over 80 years will rise from 4 to 11% over the same period [1]. This increase in the ageing population will have a significant economic impact, due to the healthcare cost for the very old population [2]. This is especially the case when older people gradually become sensitive to stress situations such as the occurrence of pathology, a reduction in financial income, the break-up of social links etc. They are in a frailty state when they are unable to react to those situations, and they gradually decline towards dependence if the frailty state is not detected early enough. As a consequence, frail people are unable to achieve the usual activities of daily living. This leads to an inability to live at home independently or without supervision, hence the need to move to a nursing home.
Detecting the onset of the frailty process early enough in home-dwelling people is of major importance to slow down or even stop the frailty process by an adapted prescription that could be an increase in physical activity, recommendations towards a better diet or a modification in drug delivery. In addition, continuous remote prescriptions follow-up could be useful to measure their effects on a variety of frailty indices and possibly modify them to improve their efficiency [3][4][5][6][7][8]. Such monitoring can be achieved using in-home technologies that are able to collect, to process and to transmit data from home-dwelling older people provided that those technologies are easily usable, socially acceptable and motivating. It has been shown (see for instance [9]) that older people accept such in-home technologies if the recipients find an improvement in, amongst others, activity and autonomy and if their privacy is respected.
Therefore, technologies for frailty detection and remote monitoring have to be designed with respect to these acceptability conditions and to produce data on the basis of reference scales that can be supplied in a quantified-self manner, or at least with the assistance of a home help or auxiliary nurses. Many of these scales have been published. The Rockwood and Mitnitski [10] model is probably the most comprehensive example (70 items); however, it is difficult to apply outside a hospital environment and clearly unsuitable for self-measurement at home. Other scales (see for instance [11,12]) cover several aspects of frailty (physical, nutritional, emotional …) with fewer items than Rockwood's, but are still difficult to adapt to a home context.
Fried et al. [13] described frailty in older adults as a phenotype that could be identified through five criteria: unintentional weight loss, self-reported exhaustion, weakness, slow walking speed, and low physical activity. These criteria have been identified as predictive of health degradation in a number of situations related, for instance, to ageing, cardiovascular problems or renal insufficiency [14][15][16]. In that way, it would be of great interest to provide general practitioners with data collected from Fried's scale in a continuous manner. Beyond the fact that this scale has been widely diffused and used to assess physical frailty, scoring its items is easy to perform by means of simple technological tools, hence well adapted to a quantified-self approach. In that way we developed a technological package (the ARPEGE Pack [17]) whose objective was to associate each Fried's criterion with a wireless device which calculated the criterion value and sent it to a local receiver (typically a tablet or a smartphone) with remote communication capability.
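As an illustration of how the five criteria can be combined into a frailty status, the following minimal sketch counts the criteria that are met. The thresholds used here (three or more criteria for frail, one or two for pre-frail) are those commonly attributed to the Fried phenotype and are stated as an assumption, since the present text does not give them; the class and function names are hypothetical and do not describe the ARPEGE software.

```python
from dataclasses import dataclass


@dataclass
class FriedCriteria:
    """One flag per Fried criterion; True means the criterion is met."""
    weight_loss: bool
    exhaustion: bool
    weakness: bool
    slow_walking_speed: bool
    low_physical_activity: bool


def classify_frailty(criteria: FriedCriteria) -> str:
    """Return a frailty status from the number of criteria met.

    Assumed rule (not stated in this text): 3 or more criteria -> frail,
    1 or 2 -> pre-frail, 0 -> non-frail.
    """
    score = sum([
        criteria.weight_loss,
        criteria.exhaustion,
        criteria.weakness,
        criteria.slow_walking_speed,
        criteria.low_physical_activity,
    ])
    if score >= 3:
        return "frail"
    if score >= 1:
        return "pre-frail"
    return "non-frail"
```

For instance, a subject meeting only the weakness and slow walking speed criteria would be labelled pre-frail under this assumed rule.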
The performance of each technological device included in the ARPEGE Pack was evaluated in a laboratory manner using reference measurements and/or involving only healthy subjects [18][19][20]. For those reasons, a validation in controlled clinical conditions with older people classified as healthy, pre-frail or frail was necessary in order to show the capability of the ARPEGE Pack to produce a proper estimation of the Fried's criteria when older people are concerned. The objective of the present work is to compare the results produced by the ARPEGE Pack with those obtained by the usual clinical tests on an ageing population and discuss the ability of that Pack to be used for long-term frailty remote monitoring.

Material
The ARPEGE Pack aims to provide the data for the five Fried's criteria by means of technological devices. The weight (then weight loss after successive measurements) is obtained by a connected bathroom scale which, in addition, includes a functionality evaluating balance quality (BQT: Balance Quality Tester [21]). In Fried's work, weakness is evaluated by means of grip strength measurements using a Jamar hand-held dynamometer. In order to make it easy and motivating to use in long-term monitoring, we developed the "Grip-ball" [19], a ball including all functionalities for measuring and wirelessly communicating the pressure exerted on it by the subject. The Grip-ball can be associated with a serious game which ensures the motivation side of the device [22]. As clinicians generally refer to force instead of pressure, a regression model has been developed to convert pressure into force [23]. In the ARPEGE study, we made use of an intermediate version of the Grip-ball where electronic and communication hardware was implemented outside the ball. Concerning walking speed, its evaluation is typically achieved in clinical practice by timing the subject walking over a specific distance (15 ft in Fried's work). The device included in the ARPEGE Pack makes use of a Doppler sensor (X-Band Doppler Motion Detector MDU 1130, Microwave Solutions Ltd., Marlowes, UK) associated with hardware for conditioning and communication needs, hidden in an object usually encountered at home (a vase in the current ARPEGE demonstrator) [20]. The two remaining Fried's criteria are assessed in the ARPEGE Pack by means of questionnaires included in the local receiver (a tablet in the present study). We had neither the time nor the wish to use the Minnesota Leisure Time Activity questionnaire extensively [24] for assessing physical activity level during the experiment, even in its short form. 
We preferred to reduce that assessment to one question adapted in French from that used in the SHARE Project ("How often do you engage in activities that require a low or moderate level of energy such as gardening, cleaning the car, or going for a walk?" [25]). For exhaustion evaluation, we adapted in French the two questions used by Fried et al. [13] and derived from the CES-D scale [26].
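The principle behind the Doppler-based walking speed measurement can be sketched with the standard Doppler relation for a target moving directly towards or away from the sensor, f_d = 2 v f_0 / c. This is only an illustration of the physics: the carrier frequency used in the usage note (10.525 GHz, a common X-band value) is an assumption and not a specification of the MDU 1130, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def doppler_shift_hz(speed_m_per_s: float, carrier_hz: float) -> float:
    """Doppler shift produced by a target moving radially at the given speed."""
    return 2.0 * speed_m_per_s * carrier_hz / SPEED_OF_LIGHT_M_PER_S


def speed_from_doppler_shift(shift_hz: float, carrier_hz: float) -> float:
    """Radial speed (m/s) corresponding to a measured Doppler shift."""
    return shift_hz * SPEED_OF_LIGHT_M_PER_S / (2.0 * carrier_hz)
```

With an assumed carrier of 10.525e9 Hz, `doppler_shift_hz(1.0, 10.525e9)` gives a shift of roughly 70 Hz for a walking speed of 1 m/s. A person walking at an angle to the sensor axis would produce a smaller shift, which is one reason why subjects were asked to walk in the radar direction.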

Protocol
One hundred ninety-four participants (Table 1) were recruited from three different locations: 141 patients coming for a geriatric examination at the Reims University Hospital (CHU Reims, France) and the Troyes non-University Hospital (CH Troyes, France), and 53 healthy older people participating in social activities organized by the Arcades First-line Prevention Center in Troyes (France). Inclusion lasted from 2013-10-01 to 2014-09-30. It was limited to people aged 70 years and over. People unable to stand up, or with severe handicaps, acute pathologies or severe cognitive disorders (Mini-Mental State Examination < 10), were excluded from the experiment.
After providing socio-demographic data, participants underwent a Comprehensive Geriatric Assessment, including level of dependency (using Katz's activities of daily living (ADL) [27] and Lawton & Brody's instrumental activities of daily living [28]), balance disorders (using Berg's scale [29]), walking difficulties (using the timed get-up and go test [30]), nutritional status (using the Mini-Nutritional Assessment Short Form [31]), risk of developing pressure sores (using the Norton scale [32]), risk of depression (using the mini-geriatric depression scale [33]), health-related quality of life (using the Duke Health Profile [34]), and level of comorbidity (using Charlson's comorbidity index [35]). In a second step, subjects were asked to declare their current weight, to test their maximal grip strength using the reference device, i.e. the Jamar dynamometer [36], and to walk 15 ft at their usual pace, the measure being the time taken to cover the distance. All data described above were first recorded manually in a paper document and then entered electronically on the Reims University server for further processing. Subjects were also asked to use the ARPEGE Pack to measure their weight (stepping on the bathroom scales), maximal grip strength (using the Grip-ball prototype), and walking speed (walking 15 ft at usual pace in the radar direction), then to answer the questions related to activity level and exhaustion (multiple choice on the tablet). All tests were carried out under the supervision of a clinical investigator. Sufficient periods of rest were allowed between successive tests involving physical activity.

Data processing
As the objective is to compare the criterion values produced by clinical evaluation and by the ARPEGE Pack, a linear regression model was calculated between quantitative data under the condition that they were Gaussian. Normality was verified for weight, grip strength and walking speed using the Kolmogorov-Smirnov statistical test (threshold p = 0.05). As the scores produced by the questionnaires were obviously non-Gaussian, the comparison between clinical tests and ARPEGE questionnaires was made by means of the Spearman correlation coefficient. The agreement between the two approaches was assessed using Cohen's kappa coefficient. All statistical tests were performed using the XLSTAT statistical package (XLSTAT Version 2015.6.01.23953, Addinsoft, Paris, France) and the Statistical Package for Social Sciences (SPSS Inc., Chicago, IL, USA). P values less than 0.05 were considered to be statistically significant.
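The two correlation coefficients mentioned above can be illustrated with a minimal, standard-library sketch. It is not the XLSTAT or SPSS implementation used in the study: the function names are hypothetical, the normality test is not reproduced, and the Spearman coefficient is computed here as the Pearson coefficient of average ranks, which is one standard way of handling the tied answers that multiple-choice questionnaires produce.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this sketch, `pearson` would apply to paired weight, grip strength or walking speed values, and `spearman` to paired questionnaire scores.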

Weight: Bathroom scale measurement vs. declaration
As the bathroom scales had been calibrated before starting the experiment session, the only result to be shown is the relationship between the weight as declared by the subject and the value produced by the scale for the same subject (Fig. 1). Both outcomes are obviously highly correlated, with a small intercept value due to the fact that the subjects were not asked to get undressed for the test.
Table note: the table shows both the mean and the standard deviation of the main items evaluated.
In addition, when excluding the two obvious outliers in Fig. 1, the coefficient of determination $R^2$ increases to 0.993. These outliers were probably caused by an error in transcription from the paper document to the server.
Grip-strength: Grip-ball vs. Jamar
As the Grip-ball sensor produces a value of pressure $P_T$ corresponding to the sum of the pressure $P_A$ exerted by the subject and the atmospheric pressure $P_I$, it was necessary to convert the results from pressure to force before comparing Grip-ball data with the reference values produced by the Jamar. The conversion model used in the present study can be found in [19]:
$$F = \left(4.855 \times 10^{-3}\, P_I - 2.020 \times 10^{-2}\right) P_A$$
with force expressed in kg and pressure in kPa. Results are displayed in Fig. 2. The slope is underestimated (0.89 instead of the expected value of 1) and the intercept is close to 2 kg. The coefficient of determination $R^2$ is equal to 0.80, i.e. a Pearson correlation coefficient equal to 0.89. If we set the intercept to 0, the slope increases to 0.967 without a significant decrease in the coefficient of determination (from 0.800 to 0.795).
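The pressure-to-force conversion quoted above from [19] can be written as a short function. This is a minimal sketch: the coefficients are those given in the text, whereas the function names and the pressure values used in the usage note are hypothetical and chosen only for illustration.

```python
def exerted_pressure_kpa(p_t_kpa: float, p_i_kpa: float) -> float:
    """Pressure P_A exerted by the subject, from the sensor reading P_T.

    Uses the decomposition given in the text: P_T = P_A + P_I.
    """
    return p_t_kpa - p_i_kpa


def grip_force_kg(p_a_kpa: float, p_i_kpa: float) -> float:
    """Force in kg from the conversion model quoted in the text.

    F = (4.855e-3 * P_I - 2.020e-2) * P_A, with pressures in kPa.
    """
    return (4.855e-3 * p_i_kpa - 2.020e-2) * p_a_kpa
```

For example, with a hypothetical $P_I$ of 101.3 kPa and an exerted pressure $P_A$ of 50 kPa, `grip_force_kg(50.0, 101.3)` evaluates to about 23.6 kg.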
Walking speed: Radar vs. stopwatch
Figure 3 shows the relationship between values produced by the Radar device and those obtained by timing subjects over the distance advocated by Fried et al. (15 ft). Note that the two measurements were not made simultaneously, so subjects were asked to walk twice with the same instruction of "at usual pace". When forcing the intercept to 0, the slope becomes 0.94 and the coefficient of determination decreases from 0.62 to 0.59, i.e. Pearson correlation coefficients of 0.79 and 0.77, respectively.
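The regression quantities reported in this section (slope, intercept, coefficient of determination, with and without a forced zero intercept) can be sketched as follows. This is an illustrative least-squares implementation with hypothetical function names, not the XLSTAT or SPSS procedure used in the study. In particular, the coefficient of determination is computed here as 1 - SS_res/SS_tot with a centered SS_tot in both cases, which is an assumption: some packages use an uncentered definition when the intercept is forced to 0.

```python
import math

FRIED_DISTANCE_M = 15 * 0.3048  # 15 ft expressed in metres


def walking_speed_m_per_s(time_s: float) -> float:
    """Average speed over the 15 ft course from a stopwatch time."""
    return FRIED_DISTANCE_M / time_s


def fit_with_intercept(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_through_origin(x, y):
    """Least-squares slope when the intercept is forced to 0."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def r_squared(x, y, slope, intercept=0.0):
    """Coefficient of determination, 1 - SS_res / SS_tot (centered SS_tot)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def pearson_from_r_squared(r2: float) -> float:
    """For simple linear regression with intercept, |r| = sqrt(R^2)."""
    return math.sqrt(r2)
```

Applying `pearson_from_r_squared` to the coefficients of determination reported in the text (0.80, 0.62 and 0.59) gives values that round to 0.89, 0.79 and 0.77, consistent with the correlation coefficients quoted.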
Exhaustion and activity level: ARPEGE vs. partial score of clinical questionnaires
As we did not have a gold standard to measure activity level and exhaustion, we extracted the results produced by the few items more or less related either to activity or to a depression state from the Duke Health Profile and the mini-GDS.
Activity level compared to Duke partial score (items 8, 9 and 16)
We selected 3 items roughly related to physical activity:
- Item 8. "Today would you have any physical trouble or difficulty when walking up a flight of stairs"
- Item 9. "Today would you have any physical trouble or difficulty when running the length of a football field"
- Item 16. "During the past week, how often did you take part in social, religious, or recreation activities?"
We obtained r = 0.250 for the Spearman correlation coefficient computed between the ARPEGE activity score (one question, see § 2.1) and the sum of Duke items 8, 9 and 16. Even though significant, the correlation is very poor, expressing the obvious lack of correspondence between the two questionnaires.
For exhaustion, the correlation value (r = 0.414) was a little better than that obtained for activity level but remained low if the objective is to replace one questionnaire by the other. As a comparison, correlations were also calculated between the ARPEGE score and the mini-GDS (r = 0.340) and between the mini-GDS and the selected items of the Duke Profile (r = 0.680).

Discussion
The ARPEGE Pack has been designed for long-term follow-up of frail people at their usual place of residence. It can be used either by older people themselves in a quantified-self manner or by home care professionals (daily care or home help), depending on the person's dependency level. It is intended for assessing older people's frailty level and monitoring the effect of prescriptions on health and well-being. This implies providing older people themselves and healthcare professionals with information on the trend of the frailty indicators in real time. This means that data produced on a specific day are of less interest than the trend evaluated over a week or even a month. Though individual measurements are displayed locally on the tablet or the smartphone (except for the walking speed, the measurement of which is totally blind to the user), they are automatically sent to a remote server where they are synthesized, typically into a moving average over one week, and displayed as trend curves on a Web site accessible by login and password to the users themselves and possibly the clinician and/or general practitioner. Health professionals are especially involved when monitoring has been proposed to follow the improvement in a person's health state following a medical prescription (adapted physical activity, diet recommendations, modification in drug prescription etc.).
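The one-week moving average mentioned above can be sketched as a trailing window over dated measurements. The function name and data layout are hypothetical, and the actual server-side processing of the ARPEGE platform is not described in this text; the sketch only shows one plausible way to turn irregular daily values into a trend.

```python
from datetime import date, timedelta


def weekly_moving_average(measurements, window_days=7):
    """Trailing moving average of dated measurements.

    measurements: iterable of (date, value) pairs, in any order.
    Returns a list of (date, mean) pairs, where each mean covers the
    values recorded during the window_days ending on that date.
    Days without a measurement are simply absent from the window.
    """
    ordered = sorted(measurements)
    trend = []
    for day, _ in ordered:
        window_start = day - timedelta(days=window_days - 1)
        window = [v for d, v in ordered if window_start <= d <= day]
        trend.append((day, sum(window) / len(window)))
    return trend
```

A trailing window is used so that each point of the trend curve depends only on past measurements, which suits a display updated as new data arrive.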
All devices included in the ARPEGE Pack have been designed to make them acceptable (even attractive) for long-term monitoring. No one would be surprised to see scales in a bathroom, and weight measurement made by the BQT does not need any specific protocol except the need to wait for the weight to be displayed on the scale screen before getting down off the scales. Communication between the scales and the local receiver (tablet or Smartphone) does not need any action from the user. On the other hand, the Grip-ball requires the user to manipulate it in order to produce a series of grip strength values day after day. Such a requirement could be demotivating over the long term if there is no attractive way to practice. This is the reason why we suggest associating the Grip-ball with a serious game [22], the score being transmitted to the local receiver only when the player identifies as the monitored person.
Relevance of the grip strength value produced by the Grip-ball has to be evaluated from Fig. 2. From the coefficient of determination (0.80), the corresponding Pearson's correlation coefficient (0.89) is equivalent to the result of Desrosiers et al. [37] who found a Pearson's correlation coefficient of 0.89 or 0.90 depending on the hand used.
The Radar device has to be placed in a location where walking speed can be measured over a distance of 2.5 to 3 m (the maximum distance to ensure proper detection by the currently available device). Such a distance can usually be found in most homes, even though we must recognize that it could be a limitation. Social acceptance of the device has been taken into account by including the sensor and associated electronics in an object commonly encountered at home (specifically a vase, Fig. 4). In its first version, the prototype was designed to be permanently activated. We are about to upgrade to a second version, including a presence detector, in order to produce and communicate a walking speed value to the local receiver only if there is someone walking in its direction. In addition, tests are scheduled in a Living Lab in order to determine how to select only the values deduced from older people's genuine walk. This is achievable because the sensor produces an instantaneous speed value. Therefore it should be possible to classify walk and non-walk movements by analysing the signal profile produced by the sensor.
Fig. 4 Prototype of the Radar device and its inclusion in a vase
In terms of speed values, comparison between Radar and stopwatch measurements leads to a coefficient of determination of 0.62, decreasing to 0.59 when forcing the intercept to 0, i.e. Pearson's correlation coefficients of 0.79 and 0.77, respectively. Even though these results seem satisfactory, discrepancies between the two measurement methods probably have two main origins. The first is investigator-dependent timing: defining precisely the crossing of start and stop lines manually could lead to differences in results from one trial to another, especially for high speeds, i.e. low timing values. The second relates to the intra-subject variability when the two trials are performed separately, which was the case in the present work. When asking subjects to walk at a comfortable gait speed, Bohannon [38] found a high correlation (0.903) between two successive trials. However, the people included were healthy individuals between 20 and 79 years of age whose physical shape and capacity were quite different from those of our population. In a previous work [20], we evaluated in the same way (Radar and stopwatch) the walking speed of twenty-three young and healthy subjects, asked to walk 10 times down a corridor over a maximum distance of 10 m at three self-determined speeds (slow, usual, fast). The resulting correlation between Radar and stopwatch measurements was around 0.90, slightly depending on the trial conditions. This favors an assumption of higher test-to-test variability for older people.
Cohen's kappa coefficient was calculated to determine the agreement between the two approaches (ARPEGE vs. Fried's scale). The value obtained, κ = 0.7613, indicates good agreement between the two approaches. The number of subjects classified as frail and non-frail by each of the two approaches is shown in Table 2.
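For reference, Cohen's kappa compares the observed agreement between two classifications with the agreement expected by chance, κ = (p_o - p_e) / (1 - p_e). The sketch below is a minimal standard-library illustration with a hypothetical function name and invented labels; it does not reproduce the counts of Table 2.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two classifications of the same subjects."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both label lists must be non-empty and equal in length")
    # Observed agreement: proportion of subjects given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal proportions, per category.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

With the invented labels `["frail", "frail", "non-frail", "non-frail"]` and `["frail", "non-frail", "non-frail", "non-frail"]`, the observed agreement is 0.75, the chance agreement is 0.5 and the function returns 0.5.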
Results related to physical activity and exhaustion are the most disappointing. However, they need to be carefully interpreted because the clinical items used as references do not correspond to a "Gold Standard" with respect to those included in the ARPEGE Pack. On the one hand, exhaustion assessment was made by a French adaptation of the two questions used by Fried et al. [13] and derived from the CES-D scale [26,39]. We should point out that the CES-D does not reflect what could be produced by the mini-GDS clinical test. On the other hand, we did not have (and did not want to have) at our disposal a relevant but time-consuming clinical tool like the Minnesota LTA questionnaire dedicated to physical activity assessment. This obviously constitutes a handicap in our attempt to evaluate the relevance of the SHARE single question related to physical activity. Therefore the results obtained when comparing answers produced either by the SHARE question or a Duke partial score are far from being satisfactory, given that we did not compare equivalent questionnaires. However, putting the work in context, the main purpose of the ARPEGE Pack is long-term monitoring of frailty evolution over time. In that case the ARPEGE Pack could include activity sensors [40,41] such as those commercially available or even included in smartphones.
Finally, we propose two hypotheses to explain why only 8 people were classified as frail (Table 2). The first hypothesis concerns the effect of the subjects' age. As presented in the protocol section, 194 participants aged 70 years and over were recruited from three different locations; 66% of the subjects were aged under 80 years and 34% were aged between 80 and 90 years. We noticed that six of the eight subjects classified as frail were aged over 80 years. We will take this result into account in any future recruitment, for example by reversing the age distribution (i.e. 66% aged over 80 years and 34% aged under 80 years). The second hypothesis, which is in our opinion more probable than the first, is that the low number might be due to the strict cut-offs used in Fried's criteria, especially for grip strength and walking speed. In fact, the Fried cut-offs were based on US community data from the CHS (Cardiovascular Health Study). In addition, it may be useful to highlight the difference we observed in average heights, for example: there was a gap of about 2.6 cm for men and 2.4 cm for women between our subjects and Fried's group. By adjusting the Fried cut-offs for these two new values, we observed that 7.27% of our subjects passed from Not-frail to Frail. Hence, it may be preferable to associate data from several European geriatric centers in order to develop specific cut-offs for European populations.

Conclusion
Remote monitoring of the physical frailty state and its evolution is part of the great challenge of allowing older people to live at home while preserving their autonomy and well-being. There are two main conditions that such a monitoring system has to meet in order to be efficient over time. The first condition is the relevance of the data produced by the sensors with respect to reference clinical tests. The ARPEGE Pack roughly fits this condition, especially for the Fried's indicators objectively measured by technological tools (weight, grip strength, walking speed). The second condition relates to acceptability and motivation. When people are requested to perform a specific action day after day, this action must be either designed to be enjoyable, which is the case for the Grip-ball associated with a serious game, or a normal part of everyday life, like using the bathroom scales. When sensors do not require any voluntary action from the user (as is the case for the Radar device), they have to be hidden in socially acceptable objects, i.e. objects usually encountered at home like the vase proposed in the ARPEGE Pack. The next step will be to test the ARPEGE Pack in real conditions at home. Such an experiment is currently being set up in association with a teleservice platform and with the active participation of the general practitioners who usually follow the patients included in the experiment.