This cross-sectional study examined the DEMMI’s psychometric properties in a convenience sample of sub-acute geriatric inpatients treated in a geriatric rehabilitation hospital in Bochum, Germany. Recruitment was initiated by the hospital physiotherapists, who were aware of the inclusion and exclusion criteria and reported potentially eligible patients to the research coordinators. These patients were then screened for eligibility and invited to participate. The study was approved by the Ethical Review Board of the German Confederation of Physiotherapy (registration number: 2012–05). All included participants provided written informed consent.
Relative and absolute inter-rater reliability were examined between two physiotherapists (T.B. and J.R.) with 5 and 7 years of work experience, respectively. Both assessors were familiar with the DEMMI, having discussed the test instructions and each performed pilot measurements in five geriatric patients prior to the reliability study. Both assessors independently performed the DEMMI in a sample of geriatric inpatients. The two DEMMI assessments were performed within 30 minutes, with a 10-minute rest between them, to create a stable test-retest situation. The order of the assessors was randomized, so that each assessor tested first in half of the patients. Both physiotherapists were blinded to each other’s results. The test conditions were similar for both measurements with respect to the environment (patient’s room).
Participants were a convenience sample of inpatients in a German geriatric hospital who were eligible on three randomly selected recruiting days during a period of 3 weeks. Participants were excluded if they had severe dysphasia, documented contraindications to mobilization or severe cognitive impairment. Patients who were isolated for infection or for whom death was imminent were also excluded. The presence of any of these exclusion criteria was determined by the clinical judgement of the treating physiotherapists, if needed in consultation with the ward physician.
The sample size approximation was based on an inter-rater reliability estimate for the DEMMI of r = 0.87 between two physiotherapists in the sub-acute hospital setting, as reported by others. Following the method presented by Bonett, given 2 raters, a planning value of ICC = 0.87 and a desired 95% confidence interval (CI) width of 0.20, a minimum sample size of 29 participants was needed.
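The arithmetic behind this estimate can be sketched as follows. The sketch assumes the closed-form approximation usually attributed to Bonett, n ≈ 8z²(1 − ρ)²(1 + (k − 1)ρ)² / (k(k − 1)w²) + 1, together with the adjustment of adding 5ρ for k = 2 raters and ρ ≥ 0.7. Both are stated here from memory as assumptions and should be checked against the original article; the function name is illustrative only.

```python
import math
from statistics import NormalDist


def bonett_icc_sample_size(icc_planning, n_raters, ci_width, confidence=0.95):
    """Approximate sample size for estimating an ICC with a desired CI width.

    Assumed formula (to be verified against Bonett's article):
        n = 8 z^2 (1 - rho)^2 (1 + (k - 1) rho)^2 / (k (k - 1) w^2) + 1
    with 5 * rho added when k == 2 and rho >= 0.7.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    rho, k, w = icc_planning, n_raters, ci_width
    n = 8 * z**2 * (1 - rho) ** 2 * (1 + (k - 1) * rho) ** 2 / (k * (k - 1) * w**2) + 1
    if k == 2 and rho >= 0.7:
        n += 5 * rho  # assumed small-k, high-ICC adjustment
    return math.ceil(n)


# Planning values taken from the text: 2 raters, ICC = 0.87, CI width = 0.20
print(bonett_icc_sample_size(0.87, 2, 0.20))  # prints 29
```

Under these assumptions the unadjusted approximation gives about 23.7, and the adjustment raises it to about 28.1, which rounds up to the 29 participants stated above.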
The DEMMI’s validity was examined in a sample of geriatric rehabilitation inpatients. Exclusion criteria were the same as for the reproducibility sample, with the addition of Mini Mental State Examination (MMSE) scores <21 points and age <60 years. Written informed consent, socio-demographic variables, MMSE, the age-adjusted Charlson Comorbidity Index (CCI) and Falls Efficacy Scale International (FES-I) were collected in a first session by a physiotherapist or undergraduate research assistants. In a second session, the DEMMI and other performance-based measures of mobility and ambulation (POMA, FAC, 6MWT, gait speed) were performed by one of three experienced physiotherapists (T.B., J.R. and a third assessor with 8 years of work experience) in a standardized order, starting with the DEMMI in each session. All assessors were trained in the administration of the outcome measures.
The independent reliability and validity samples were pooled to increase the sample size for the subsequent Rasch analysis.
Measures of mobility
The DEMMI consists of 15 items. The patient is asked to perform mobility tasks in several positions (bed, chair, stand, walk), which the examiner rates on 2- or 3-point response options, resulting in a maximum ordinal score of 19 points. A conversion table allows for transformation of the raw score into a total interval DEMMI score, which ranges from 0 to 100 points, with higher scores indicating a higher level of mobility. The DEMMI has a hierarchical structure, and thus each assessed individual can be located on the 101-point mobility spectrum. The DEMMI form consists of one paper sheet, with the items printed on one side and the instruction protocol on the other, which makes it easy to use in clinical practice [20,28]. The German DEMMI and a German instruction handbook can both be downloaded free of charge (www.hs-gesundheit.de).
The POMA is a clinician-observed measure of mobility and fall risk, consisting of 2 sub-scales (balance and gait) [10,31]. A maximal total score of 28 points can be reached, with higher scores indicating higher mobility functions. Although results of reproducibility are inconclusive [15,16,32], it is considered to be a valid measure of older people’s fall risk and mobility [10,31,32].
The clinician-completed FAC rates the level of independence and functional ambulation over a walking distance of 10 meters on a 6-point ordinal scale [11,33,34]. Lower scores, where physical assistance is needed, indicate poorer mobility than higher scores, where the patient is able to ambulate independently.
For the 6MWT, the test subject is asked to walk along a plain walkway for 6 minutes. The distance in meters is measured, with longer distances indicating a better walking capacity and higher velocity. Walking aids were allowed and breaks were offered if needed. The 6MWT is a reliable and valid instrument to quantify mobility and walking endurance in older individuals [35,36]. In non-ambulatory participants, the 6MWT was scored as 0 meters.
Comfortable gait speed was assessed over a distance of 10 meters. The time measurement started after a gait initiation phase of a few steps, and participants were allowed to use their usual walking aid. Distance and time were measured with a measuring wheel and a stopwatch, respectively. To reduce the burden on participants, measurements were taken during the 6MWT performance. Gait speed can be measured reliably and is a valid measure of mobility and health status in older people [37,39]. Participants who could not ambulate without physical assistance, or who needed >90 seconds, were scored as non-ambulatory.
The FES-I is one of the most commonly used measures of fear of falling [40,41]. The person is asked to rate his or her concerns about falling during several activities of daily living (ADL) on a 4-point Likert scale (“not at all concerned” to “very concerned”). Most questions deal with concerns in mobility activities (such as getting in or out of a chair, walking around in the neighbourhood, walking on an uneven surface). Scores range from 16 to 64 points, with higher values representing more concerns in fall-prone situations. As there is a strong correlation between fall risk, ambulation and mobility [42,43], a German version of the FES-I was administered by interview as a reproducible and valid self-reported instrument for construct validity analysis [40,44,45].
Data were analysed using SPSS 21.0 for all analyses except for the Rasch analysis, which was completed using RUMM2030. Descriptive statistics were used to present sample characteristics. Interval-based data were examined for normal distribution with the Shapiro-Wilk test of normality and by visual inspection of the related histograms and P-P plots. A P value <0.05 indicated statistical significance in all analyses.
Inter-rater reliability was examined using the intra-class correlation coefficient (ICC) model 2,1 (two-way random effects model) [46,47]. Type of disease, as a potential confounding factor, was analysed by visual inspection of a scatter plot. A uniform distribution of points without formation of disease groups (ICD-10 categories: musculoskeletal, circulatory, respiratory, nervous system or digestive, based on the primary diagnosis given by the ward physician) would indicate no confounding by the factor “type of disease”.
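As an illustration of what this coefficient measures, the following minimal sketch computes a single-measure, absolute-agreement ICC from a two-way random effects ANOVA decomposition, ICC(2,1) = (MSR − MSE) / (MSR + (k − 1)MSE + k(MSC − MSE)/n). It is not the SPSS procedure used in the study, and the rater scores shown are hypothetical values chosen only to demonstrate the call.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one row per participant, one score per rater.
    """
    n = len(ratings)       # number of participants
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for participants (rows), raters (columns) and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical DEMMI scores (rater 1, rater 2) for five participants
hypothetical_scores = [[41, 44], [57, 53], [62, 62], [30, 33], [74, 71]]
print(round(icc_2_1(hypothetical_scores), 3))
```

Because the rater (column) variance enters the denominator, a systematic offset between the two raters lowers this coefficient, which is what distinguishes absolute agreement from mere consistency.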
The minimal detectable change with 90% confidence (MDC90), a quantification of absolute agreement, was calculated as 1.64 × √2 × standard error of measurement (SEM). The SEM was calculated as the pooled standard deviation (SD) × √(1 − ICC). MDC90 is defined as the minimal amount of change that needs to occur between repeated assessments in an individual to exceed, with 90% confidence, the error of the measurement. The method of Bland and Altman was used to illustrate agreement between the two raters. Differences between raters were plotted against their mean score. Thus, points scatter around a horizontal mean difference line, which should be close to zero, within the upper and lower 95% limits of agreement (ie, mean difference ± 1.96 SD of the difference). Cronbach’s alpha, a measure of internal consistency, was derived from the validity sample due to the larger sample size.
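The formulas in this paragraph can be written out as a short sketch. The function names and all input values are hypothetical and serve only to show the calculation; how the pooled SD itself is obtained is not specified here, so it is passed in as a given value.

```python
import math
from statistics import mean, stdev


def sem(pooled_sd, icc):
    """Standard error of measurement: pooled SD x sqrt(1 - ICC)."""
    return pooled_sd * math.sqrt(1 - icc)


def mdc90(pooled_sd, icc):
    """Minimal detectable change with 90% confidence: 1.64 x sqrt(2) x SEM."""
    return 1.64 * math.sqrt(2) * sem(pooled_sd, icc)


def bland_altman_limits(scores_a, scores_b):
    """Mean inter-rater difference and 95% limits of agreement (mean +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample SD of the differences
    return d_mean, d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd


# Hypothetical inputs, not study results
print(round(sem(10.0, 0.84), 2), round(mdc90(10.0, 0.84), 2))
print(bland_altman_limits([50, 60, 70, 80], [48, 62, 71, 77]))
```

A change score in an individual patient would be interpreted as exceeding measurement error only if it is larger than the value returned by `mdc90`.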
Convergent, discriminant and known-groups validity were examined as different aspects of the DEMMI’s construct validity. Correlations between the DEMMI and other functional measures were calculated with Spearman’s correlation coefficient rho (ordinal) and Pearson’s correlation coefficient r (interval), together with the appropriate 95% CIs. We hypothesized that the DEMMI would show a very strong (≥0.80) correlation with a multi-component mobility scale (POMA) and a strong (≥0.70) correlation with outcome measures of ambulation alone (FAC, gait speed, 6MWT). The FES-I is a patient-reported measure of fear of falling during performance of ADLs, a construct considered to be related to perceived mobility, but not as strongly as performance-based outcome measures of ambulation. Therefore, we hypothesized a negative moderate correlation (−0.50 to −0.69) between DEMMI and FES-I scores. The hypothesis with respect to discriminant validity was a non-significant, low correlation between DEMMI scores and measures of comorbidity and cognition (CCI and MMSE, respectively).
For known-groups validity, we hypothesized that participants ambulating without a walking aid would have significantly higher DEMMI scores than participants using a walking aid (Mann–Whitney U test, P < 0.05). The difference between the mean scores of both groups was assumed to exceed the minimal clinically important difference (MCID) of 10 DEMMI points reported for the Australian English DEMMI version. Furthermore, three groups were defined according to the self-reported level of dependence in in-hospital ambulation (non-ambulatory, ambulatory with assistance and independent ambulation). It was hypothesized that mean DEMMI scores would be higher in participants who were mobile with assistance than in non-ambulatory participants, and that independently ambulating participants would have the highest scores. Mean group differences, which were hypothesized to be larger than 10 DEMMI points, were investigated using a Kruskal-Wallis test with post hoc analysis between groups (Mann–Whitney U test with corrected P < 0.017).
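The corrected threshold of 0.017 is consistent with a Bonferroni-type correction of the 0.05 level for the three pairwise comparisons among three groups. The text does not name the correction, so this reading is an assumption; the short sketch below only reproduces the arithmetic.

```python
from itertools import combinations

# Ambulation groups as defined in the text
groups = ["non-ambulatory", "ambulatory with assistance", "independent ambulation"]

# All pairwise post hoc comparisons between the groups
pairs = list(combinations(groups, 2))

# Assumed Bonferroni-type correction: overall alpha divided by number of comparisons
alpha_corrected = 0.05 / len(pairs)

for first, second in pairs:
    print(f"{first} vs {second}: significant if P < {alpha_corrected:.3f}")
```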
The English DEMMI version was developed based on the Rasch model in 106 Australian older acute medical patients (81.2 ± 7.3 years of age, 47% female). Data fitted the model in various populations, such as patients with hip fracture, older acute medical patients and older patients with knee or hip osteoarthritis [20,27,53]. The Rasch model is a probabilistic model that asserts that item response is a logistic function of item difficulty and person ability. Rasch analysis was conducted in this study to complete the cross-cultural validation process for the German version of the DEMMI.
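For orientation, the logistic relationship described above can be written in its simplest, dichotomous form. The symbols are chosen here for illustration: θ_n denotes the ability of person n and δ_i the difficulty of item i, both in logits. DEMMI items with three response options would require a polytomous extension of this expression, which is not shown.

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

When person ability equals item difficulty (θ_n = δ_i), the probability of success on the item is 0.5; it increases as ability exceeds difficulty.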
Overall fit to the model was indicated by an item-trait interaction chi-square P value greater than 0.05; item fit was indicated by fit residuals within ±2.5 and a non-significant Bonferroni-adjusted chi-square P value. Local independence of items is an assumption of the Rasch model. Local dependence occurs when the response to one item is dependent on the response to another and can inflate the apparent internal consistency of the scale. The assumption of local independence of items was checked by identifying any items with person-item residual correlations larger than 0.2. A subtest analysis using the correlated items was then undertaken to determine whether the internal consistency (Person Separation Index and Cronbach’s alpha) of the whole item set was higher than for the subtest. The assumption of unidimensionality (all items reflecting a single underlying latent trait) was tested by creating subsets of items with the most different loadings on the residual principal components analysis. Paired t-tests were conducted on the estimates of person abilities generated using the item subsets; fewer than 5% of cases with significantly different scores (P < 0.05) indicated a unidimensional scale.
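The unidimensionality criterion can be illustrated with a minimal sketch. It assumes that each person has two ability estimates (one per item subset), each with its own standard error, that the per-person t statistic is the difference divided by the combined standard error, and that ±1.96 serves as the critical value for P < 0.05. The exact computation in RUMM2030 may differ, and all values below are hypothetical.

```python
import math


def proportion_significant(theta_a, se_a, theta_b, se_b, critical=1.96):
    """Share of persons whose two ability estimates differ significantly.

    theta_a, theta_b: person ability estimates (logits) from the two item subsets.
    se_a, se_b: standard errors of those estimates.
    """
    flagged = 0
    for ta, sa, tb, sb in zip(theta_a, se_a, theta_b, se_b):
        t = (ta - tb) / math.sqrt(sa**2 + sb**2)  # assumed per-person t statistic
        if abs(t) > critical:
            flagged += 1
    return flagged / len(theta_a)


# Hypothetical estimates for four persons
share = proportion_significant(
    theta_a=[0.5, 1.0, -0.3, 2.0], se_a=[0.5, 0.5, 0.5, 0.5],
    theta_b=[0.4, 1.2, -0.2, 0.1], se_b=[0.5, 0.5, 0.5, 0.5],
)
print(share, "unidimensional" if share < 0.05 else "possible multidimensionality")
```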
Differential Item Functioning (DIF) is a form of item bias that occurs when persons of the same ability perform differently on an item depending on another variable. In this study, DIF for the DEMMI was investigated for age (<80 years and 80+ years), gender and age-adjusted CCI score (0–6 and 7+). A target sample size of 100 to 144 participants was set for this study to provide 95% confidence within ±0.5 logits.