Novel use of the Nintendo Wii board as a measure of reaction time: a study of reproducibility in older and younger adults
BMC Geriatrics volume 15, Article number: 80 (2015)
Reaction time (RT) has been associated with falls in older adults, but is not routinely tested in clinical practice. A simple, portable, inexpensive and reliable method for measuring RT is desirable for clinical settings. We therefore developed custom software that uses the portable, low-cost standard Nintendo Wii board (NWB) to record RT. The aims of the study were (1) to explore whether the test could differentiate old and young adults, (2) to study learning effects between test sessions, and (3) to examine reproducibility.
A young (n = 25, age 20–35 years, mean BMI of 22.6) and an old (n = 25, age ≥65 years, mean BMI of 26.3) study population were enrolled in this within- and between-day reproducibility study. A standard NWB was used along with the custom software to obtain RT from participants in milliseconds. A mixed effects model was initially used to explore systematic differences associated with age and test session. Reproducibility was then expressed by Intraclass Correlation Coefficients (ICC), Coefficient of Variation (CV), and Typical Error (TE).
The RT tests were able to differentiate the old group from the young group in both the upper extremity test (p < 0.001; −170.7 ms (95 % CI −209.4 to −132.0)) and the lower extremity test (p < 0.001; −224.3 ms (95 % CI −274.6 to −173.9)). Moreover, the mixed effects model showed no significant learning effect between sessions, with the exception of the lower extremity test between session one and three for the young group (−35.5 ms; 4.6 %; p = 0.02). Good within- and between-day reproducibility (ICC: 0.76–0.87; CV: 8.5–12.9; TE: 45.7–95.1 ms) was achieved for both the upper and lower extremity test with the fastest of three trials in both groups.
A low-cost and portable reaction test utilizing a standard Nintendo Wii board showed good reproducibility and little or no systematic learning effect across test sessions, and could differentiate between young and older adults in both upper and lower extremity tests.
Fall accidents within the rapidly growing ageing population are a major concern worldwide due to serious consequences such as increased morbidity, decreased functional levels, admission to long-term care facilities, and higher mortality [2, 3]. Many reports have associated fall accidents in older adults with an increased reaction time (RT) in either the upper or lower extremities [4–9], but RT is not routinely tested in clinical practice. Lord et al. performed a prospective study on 341 community-dwelling women (65+ years of age) and found a strong association between falling and increased lower limb RT when fallers were compared to non-fallers. Another prospective study found that upper and lower extremity RT, along with other physiological, cognitive and medical factors, could discriminate between fallers and non-fallers. These studies, along with others, have paved the way for the physiological profile approach, which uses both upper and lower extremity RT when determining the fall risk of older adults. Finally, several researchers have found RT measures to be responsive to exercise interventions in older adults [11–13], making the RT measures relevant in clinical practice. A common protocol for assessing RT is to measure the time between the presentation of a light stimulus and the subsequent hitting of a response button, testing either the upper extremities, the lower extremities, or both. RT has been tested extensively in both athletic populations (e.g. soccer, racecar, lacrosse) and non-athletic populations (e.g. single fallers vs. recurrent fallers, community-dwelling older adults, sport science students), and in various age groups [16, 21–23]. In most cases these studies have used expensive laboratory equipment [8, 16–21], which prevents wide application of RT testing. Occasionally a fast, simple and inexpensive method for measuring RT has been tested, but was subsequently reported to be unreliable.
This highlights the need for a reliable, inexpensive, widely available, and portable system for evaluating RT. The Nintendo Wii Board (NWB) could satisfy these needs, as it measures 50 cm × 30 cm × 5 cm, weighs 4.5 kg, costs around $100, and has sold over 43 million units worldwide. Moreover, the NWB has already been used in other scientific studies along with “off the shelf” software for exercise interventions [24, 25], evaluation of postural balance, and prediction of falls, and with custom software for postural balance evaluation in children.
However, to our knowledge no previous studies have explored the NWB for evaluating upper and lower extremity RT. We therefore developed a software application that uses the force transducers of the NWB to record RT in both the upper and lower extremities. The aims of this study were (1) to explore whether RT could differentiate between older and younger adults, (2) to determine whether a learning effect existed between test sessions, and (3) to examine the reproducibility of the RT test.
We recruited participants in two age groups. Participants in the younger group (20–35 years of age) were recruited from the campus of Aalborg University, Denmark, and participants in the older group (≥65 years of age) from senior citizen clubs and organizations in the Aalborg area (Table 1). In both groups participants were excluded if they had had any acute illness within the previous 3 weeks, had a neurological disease (such as dementia or Parkinson's disease) or a visual impairment (Snellen score <3/60), were taking medication (psychotropic, hypnotic or anti-depressive) that could influence RT, or had undergone orthopedic surgery within the previous 6 months.
The study was designed, performed, and analyzed according to guidelines for reporting reliability and agreement studies (GRRAS). A within- and between-day design was applied using a single rater (intra-rater). Within-day reproducibility was explored using a one-hour break between sessions, and between-day reproducibility was tested with 1–7 days between sessions (Fig. 1).
To collect data, the NWB was connected to a laptop computer (Lenovo Yoga Pro, Windows 8) over the Bluetooth HID wireless protocol, and the data were imported into a custom program written in C#. The evaluation of RT in the lower (FOOT) and upper (HAND) extremity was carried out by performing a series of stepping and hammering tests on the NWB. In the stepping test (Fig. 2a), participants stood in a bilateral weight-bearing position directly in front of the NWB, approximately 100 cm from the screen of a laptop computer. For the hammering test (Fig. 2b), participants were seated on a standard chair with the NWB in front of their fists and the laptop screen approximately 80 cm away from their eyes. In both tests, participants were instructed to react as fast as possible to a visual stimulus displayed on the computer. The visual stimulus was presented as a green light at a random time (between 1 and 4 s) and on a random side (either left or right) of the computer screen. The internal timer was stopped when the appropriate side of the NWB (according to the visual cue) was hit. The operator had to restart the test immediately after the previous test, resulting in random inter-trial intervals.
The elapsed time in milliseconds (RT) from the moment the visual stimulus was displayed until the participant responded by stepping or hammering on the NWB was then logged and stored in a database on the computer for further analysis. At each session (1, 2 and 3) participants completed 10 consecutive stepping tests (5 left and 5 right in random order) and 10 consecutive hammering tests (5 left and 5 right in random order), totaling 60 trials per participant in the whole study.
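The trial logic described above can be illustrated with a short Python sketch. It is only an illustration, not the authors' C# program: the function name `run_trial` and the `read_hit_side` callback (standing in for whatever reports which half of the board was hit) are hypothetical, and drawing the green light on screen is left out.

```python
import random
import time


def run_trial(read_hit_side, sleep=time.sleep, clock=time.perf_counter, rng=random):
    """Run one reaction-time trial and return (side, delay_s, rt_ms).

    read_hit_side is a hypothetical callback that blocks until the board
    is hit and returns "left" or "right".
    """
    delay_s = rng.uniform(1.0, 4.0)        # random wait between 1 and 4 s
    side = rng.choice(("left", "right"))   # random stimulus side
    sleep(delay_s)                         # wait before showing the stimulus
    # A real program would display the green light on `side` here.
    t0 = clock()                           # start the internal timer
    while read_hit_side() != side:         # timer stops only on the cued side
        pass
    rt_ms = (clock() - t0) * 1000.0        # elapsed time in milliseconds
    return side, delay_s, rt_ms
```

Passing the sleep and clock functions as parameters is a design choice of this sketch: it lets the timing logic be checked with a simulated clock instead of real hardware.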
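As a rough illustration of this trial structure, the following Python sketch builds a randomized schedule of 5 left and 5 right trials per test and session. The function names and the use of a seeded random generator are assumptions made for the example, not details of the study software.

```python
import random


def session_schedule(rng):
    """Return the cue sides for one session: 10 FOOT and 10 HAND trials."""
    schedule = {}
    for test in ("FOOT", "HAND"):
        sides = ["left"] * 5 + ["right"] * 5   # 5 left and 5 right cues
        rng.shuffle(sides)                     # presented in random order
        schedule[test] = sides
    return schedule


def study_schedule(seed=0):
    """Return schedules for sessions 1-3 (3 x 20 = 60 trials in total)."""
    rng = random.Random(seed)
    return {session: session_schedule(rng) for session in (1, 2, 3)}
```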
All statistical analyses were performed using SPSS (version 21, IBM, New York, USA). A statistician at the department of statistics, Aalborg University Hospital performed and validated the statistical models used in this study. Reaction times were determined using the developed software and expressed as mean ± SD.
Preliminary analysis of the data focused on systematic differences associated with age and session (learning effects), and a mixed effects model was applied. This model used subjects as a random effect and age and session as fixed effects. Reproducibility was afterwards expressed for within-day and between-day by Intraclass Correlation Coefficients (ICC), Coefficient of Variation (CV), and Typical Error (TE). In addition, we reported different means and fastest values of the recordings in order to give the reader added transparency of the reproducibility. ICC was chosen to assess relative reliability and determined as between-subject variance versus total variance, and was interpreted using the following criteria: 0.00–0.39 poor, 0.40–0.59 fair, 0.60–0.74 good, and 0.75–1.00 excellent. In the present study, a two-way mixed model using absolute agreement between sessions was used to calculate ICC. This is a conservative approach since a prerequisite is that no systematic differences exist between sessions. CV was chosen to give another view on relative reliability, and details of the equation used to calculate it have previously been reported. TE is an absolute measure of within-subject variation in RT. In this study, TE was calculated using the following equation:
The selected sample size in each group was based on recommendations from the COSMIN checklist and experts within the field of reliability studies, which recommend around 20–50 subjects for test-retest studies [33, 34].
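Typical error is commonly defined as the standard deviation of the between-session difference scores divided by √2, and the Python sketch below assumes that definition. It also assumes that CV is TE expressed as a percentage of the grand mean, and that the ICC is the single-measures, two-way, absolute-agreement form. None of these choices is confirmed by the text above, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def typical_error(s1, s2):
    """TE = SD of the paired differences / sqrt(2) (assumed definition)."""
    diffs = [b - a for a, b in zip(s1, s2)]
    return stdev(diffs) / sqrt(2)


def cv_percent(s1, s2):
    """CV as TE relative to the grand mean, in percent (assumed definition)."""
    return 100.0 * typical_error(s1, s2) / mean(list(s1) + list(s2))


def icc_absolute_agreement(s1, s2):
    """Single-measures, two-way, absolute-agreement ICC for two sessions."""
    rows = list(zip(s1, s2))
    n, k = len(rows), 2
    grand = mean(v for r in rows for v in r)
    row_means = [mean(r) for r in rows]          # one mean per participant
    col_means = [mean(c) for c in zip(*rows)]    # one mean per session
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                      # between-subject mean square
    msc = ss_cols / (k - 1)                      # between-session mean square
    mse = ss_err / ((n - 1) * (k - 1))           # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Each function takes two equal-length lists of RT values in milliseconds, one value per participant and session, for example the fastest of three trials.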
Prior to any tests, participants received a written and oral presentation of the experiment by the investigator and gave their written informed consent. The Danish North Jutland regional ethical committee approved the study (N-230878).
Upper extremity test
Results from the mixed effects model based on 10 trials at each session (1, 2 and 3) indicated a statistically significant effect of age, favoring the young group in RT for the upper extremity (p < 0.001; −170.7 ms (95 % CI −209.4 to −132.0)). In addition, the mixed model confirmed that no significant learning effects were observed between any of the sessions in the RT test for the upper extremity in either group (Table 2 & Fig. 3).
Within-day for the upper extremities the lowest ICC value seen was .437 and the highest was .841 for the older adults and for the younger adults the lowest ICC value was .775 and the highest .927. Between-day the lowest ICC value was .633 and the highest .777 for the older adults and for the younger adults the lowest was .786 and the highest .874 (Table 3).
Lower extremity test
Results from the mixed effects model based on 10 trials indicated a statistically significant effect of age, favoring the young group in reaction time for the lower extremities (p < 0.001; −224.3 ms (95 % CI −274.6 to −173.9)). In addition, the mixed effects model confirmed that no significant learning effects were observed between any sessions for the older group. The pattern was the same in the younger group between sessions 1–2 and 2–3; between sessions 1–3, however, a slight systematic difference was observed (Table 4 & Fig. 4).
Within-day for the lower extremities the lowest ICC value was .837 and the highest .992 for the older adults, and for the younger adults the lowest ICC value was .349 and the highest was .923. Between-day the lowest ICC value was .743 and the highest .868 for the older adults, and for the younger adults the lowest was .370 and the highest .893 (Table 5).
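To relate the reported ICC ranges to the interpretation criteria stated in the statistical analysis (0.00–0.39 poor, 0.40–0.59 fair, 0.60–0.74 good, 0.75–1.00 excellent), a small Python helper such as the following could be used. The function name is illustrative and not part of the study software.

```python
def interpret_icc(icc):
    """Map an ICC value to the qualitative labels used in this study."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC expected in the range 0 to 1")
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"
```

For example, the lowest within-day value for the younger adults in the lower extremity test (.349) falls in the poor band, whereas the lowest between-day value for the older adults (.743) falls in the good band.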
The NWB has the potential to become a good clinical tool for assessment of RT in a wide range of populations, as it is inexpensive, portable, and reliable. Overall, the results indicate that the RT tests can differentiate older adults from younger adults. Secondly, no learning effect within or between days could be observed for any of the tests, with the exception of the lower extremity RT test between session 1 and 3 for the younger participants. Finally, good reproducibility could be achieved within and between days for both the upper and lower extremity test using the fastest of three trials in both the older and younger participants.
This was the first study to explore RT measured with a standard NWB using custom software. Previously, researchers have primarily measured RT using expensive laboratory equipment [8, 16–20]. The results from the current study showed that for both the upper and lower extremity test the younger adults were markedly faster on average than the older adults (−170.7 ms ~24 % difference and −224.3 ms ~23 % difference, respectively). This was anticipated, as numerous studies have shown it previously [16, 21–23].
The mixed effects model used in the present study showed no systematic effect between any of the sessions for the RT test in either the upper or lower extremities for either the older or younger adults. The only exception was in the lower extremity test, where a significant decrease (−35.5 ms; 4.6 %) in RT was observed for the younger adults between session 1 and 3. In the older group, a similar trend was seen between session 1 and 3 in the lower extremity test. This possible learning effect might be related to the nature of the lower extremity test, which is weight-bearing, in contrast to the upper extremity test. If, for example, the visual cue was presented on the left side of the computer screen and the participant was predominantly or fully supporting their weight on the same-side leg (left leg), then the RT would be slower than if the participant had distributed their weight equally on both legs. If this was the case in sessions 1 and 2, the participant would first have had to shift their weight onto the other leg (here the right leg) in order to react appropriately to the visual stimulus (left side). In session 3 the younger adults may have adapted by adopting a more equal weight distribution on their legs, resulting in a faster RT compared to session 1. In the future this mediolateral swaying may be avoidable by giving participants clear and very specific instructions to keep their weight equally distributed on both legs. Another possible explanation for the trend between session 1 and 3 in the older group might be that older adults tend to improve the speed, accuracy, and consistency of their motor response between sessions [35, 36]. In support of the above, in the present study the standard deviation in general became smaller with an increased number of recordings and across sessions (time 1, 2, and 3) for both groups in the lower extremity test.
This underpins that a learning effect across sessions has taken place in the lower extremity test, and that future test protocols should focus on this problem.
In the present study, good reproducibility (ICC, CV and TE) was achieved within and between days for both the upper and lower extremity RT test in both groups. However, in 2013 Spiteri et al. reported slightly better Coefficient of Variation (CV) values, ranging from 1.48 to 3.35 %, but similar or lower ICC values, ranging from .71 to .83, for a simple lower extremity RT test in 5 young adults (university sports science students) when averaging 10 trials. These slightly better CV values compared to the current study might be explained by Spiteri et al. handpicking 5 out of the original 30 participants for the sub-study on reproducibility. Moreover, participants were allowed two practice trials prior to each test before commencing the counting tests. In the current study, this type of participant selection did not occur and, not surprisingly, we found slightly higher CV percentage values for both the young (6.2 %) and older (7.3 %) adults. However, the present study produced similar or higher ICC values compared to Spiteri et al. when averaging 10 trials. These similar or higher ICC values might be explained by the current study having a much more heterogeneous study population in both groups than the very small study population in the Spiteri et al. study. In another reproducibility study, Mercer and coworkers evaluated a simple, inexpensive, and portable ruler method for measuring RT in 30 community-dwelling older adults using an intra-day design (20 min pause). They found the method to be outside of an acceptable reproducibility range, as the ICC was .53 between sessions. In addition, they observed a significant learning effect between sessions, which further disqualified the method.
However, the Mercer study reported only ICC values with respect to reproducibility, and this may have played to its disadvantage, as other statistical methods (which are affected to a lesser degree by the study population) could have shown greater reproducibility. The ICC depends on the ratio of the variability between participants to the total variability, and is thus affected by factors related to the study sample itself. The CV percentage and similar measures (Bland-Altman plots, Limits of Agreement) are less affected by the study sample and are important to report, as they give an indication, either in percent or in absolute values, of the measurement error of the method. This becomes of great importance when evaluating the effect of an intervention study or a rehabilitation course, as some of the apparent effect may be attributable to measurement error. Based on the measures of reliability (ICC) and agreement (CV, TE), the authors recommend using the fastest trial of three in both the upper and lower extremity RT test in both groups in order to minimize measurement error and at the same time be time efficient. However, to avoid a learning effect across days of testing, it is recommended that habituation trials are applied for both young and old adults when testing RT in the lower extremities.
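The dependence of ICC on the study sample can be made concrete with a small numerical sketch in Python. It uses the simplified relation ICC = between-subject variance / (between-subject variance + error variance) and ignores systematic session effects. The numbers are hypothetical and chosen only for illustration; they are not taken from this study.

```python
def expected_icc(between_subject_sd, typical_error):
    """ICC implied by a between-subject SD and a typical error (same units)."""
    between_var = between_subject_sd ** 2
    error_var = typical_error ** 2
    return between_var / (between_var + error_var)


# Hypothetical example: the same measurement error (TE = 50 ms) in a
# homogeneous sample and in a heterogeneous sample.
homogeneous = expected_icc(between_subject_sd=50.0, typical_error=50.0)
heterogeneous = expected_icc(between_subject_sd=150.0, typical_error=50.0)
```

With an identical typical error of 50 ms, the implied ICC is 0.5 in the homogeneous sample and 0.9 in the heterogeneous one, which is why an absolute measure such as TE is informative alongside ICC.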
A strength of the current study is the wide availability of NWBs. Approximately 43 million Wii boards have been sold across the world. In addition, it is a very cost-efficient and portable method compared to existing methods. Moreover, we were able to test both upper and lower extremities, and explore within- and between-day reproducibility, in one study. Finally, the present study complied with internationally accepted guidelines for reporting reproducibility studies, as several measures of both reliability (ICC) and agreement (CV and TE) were reported. A weakness of the current study is the lack of validation. This study did not correlate results with a ‘gold standard’ within reaction time testing, which would have added to the potential future use of the test in clinical settings. Finally, the custom software prepared for the current study is not yet widely available, limiting the usefulness of the study. However, in the near future the authors plan to validate the Wii-RT test against a ‘gold standard’ method and make the software widely available to clinicians and researchers.
This study found that a portable reaction test utilizing a standard Nintendo Wii board could differentiate between young and older adults in upper and lower extremities. In addition, no significant systematic differences were observed within-day or between-day for the reaction tests, with the exception of the lower extremity test between session one and three in the young group. Good reproducibility was observed in both the upper and lower extremity test for both the young and old group using the fastest of three recordings. Future studies should aim at validating the Wii-RT test against a “gold standard” reaction test.
U. C. B. P. I. Office. “U.S. Census Bureau Projections Show a Slower Growing, Older, More Diverse Nation a Half Century from Now - Population - Newsroom - U.S. Census Bureau”.
Rubenstein LZ. Falls in older people: epidemiology, risk factors and strategies for prevention. Age Ageing. 2006;35 Suppl 2:ii37–41.
Cummings-Vaughn LA, Gammack JK. Falls, osteoporosis, and hip fractures. Med Clin North Am. 2011;95(3):495–506.
Lajoie Y, Gallagher S. Predicting falls within the elderly community: comparison of postural sway, reaction time, the Berg balance scale and the Activities-specific Balance Confidence (ABC) scale for comparing fallers and non-fallers. Arch Gerontol Geriatr. 2004;38(1):11–26.
Lord SR, Ward JA, Williams P, Anstey KJ. Physiological factors associated with falls in older community-dwelling women. J Am Geriatr Soc. 1994;42(10):1110–7.
Lord SR, Clark RD. Simple physiological and clinical tests for the accurate prediction of falling in older people. Gerontology. 1996;42(4):199–203.
Lord SR, Fitzpatrick RC. Choice stepping reaction time: a composite measure of falls risk in older people. J Gerontol A Biol Sci Med Sci. 2001;56(10):M627–32.
Maver SL, Dodd K, Menz H. Lower limb reaction time discriminates between multiple and single fallers. Physiother Theory Pract. 2011;27(5):329–36.
Uusi-Rasi K, Kannus P, Karinkanta S, Pasanen M, Patil R, Lamberg-Allardt C, et al. Study protocol for prevention of falls: a randomized controlled trial of effects of vitamin D and exercise on falls prevention. BMC Geriatr. 2012;12:12.
Lord SR, Menz HB, Tiedemann A. A physiological profile approach to falls risk assessment and prevention. Phys Ther. 2003;83(3):237–52.
Bisson E, Contant B, Sveistrup H, Lajoie Y. Functional balance and dual-task reaction times in older adults are improved by virtual reality and biofeedback training. CyberPsychology Behav. 2007;10(1):16–23.
Lord SR, Castell S, Corcoran J, Dayhew J, Matters B, Shan A, et al. The effect of group exercise on physical functioning and falls in frail older people living in retirement villages: a randomized, controlled trial. J Am Geriatr Soc. 2003;51(12):1685–92.
Rooks DS, Kiel DP, Parsons C, Hayes WC. Self-paced resistance training and walking exercise in community-dwelling older adults: effects on neuromotor performance. J Gerontol A Biol Sci Med Sci. 1997;52(3):M161–8.
Crabtree DA, Antrim LR. Guidelines for measuring reaction time. Percept Mot Skills. 1988;66(2):363–70.
Mercer VS, Hankins CC, Spinks AJ, Tedder DD. Reliability and validity of a clinical test of reaction time in older adults. J Geriatr Phys Ther. 2009;32(3):103–10.
Darbutas T, Juodžbalienė V, Skurvydas A, Kriščiūnas A. Dependence of reaction time and movement speed on task complexity and age. Medicina (Kaunas). 2013;49(1):18–22.
Spierer DK, Petersen RA, Duffy K. Response time to stimuli in division I soccer players. J Strength Cond Res. 2011;25(4):1134–41.
Baur H, Müller S, Hirschmüller A, Huber G, Mayer F. Reactivity, stability, and strength performance capacity in motor sports. Br J Sports Med. 2006;40(11):906–10; discussion 911.
Spierer DK, Petersen RA, Duffy K, Corcoran BM, Rawls-Martin T. Gender influence on response time to sensory stimuli. J Strength Cond Res. 2010;24(4):957–63.
Spiteri T, Cochrane JL, Nimphius S. The evaluation of a new lower-body reaction time test. J Strength Cond Res. 2013;27(1):174–80.
Anstey KJ, Dear K, Christensen H, Jorm AF. Biomarkers, health, lifestyle, and demographic variables as correlates of reaction time performance in early, middle, and late adulthood. Q J Exp Psychol A. 2005;58(1):5–21.
Der G, Deary IJ. Age and sex differences in reaction time in adulthood: results from the United Kingdom Health and Lifestyle Survey. Psychol Aging. 2006;21(1):62–73.
Dykiert D, Der G, Starr JM, Deary IJ. Age differences in intra-individual variability in simple and choice reaction time: systematic review and meta-analysis. PLoS ONE. 2012;7(10):e45759.
Jorgensen MG, Laessoe U, Hendriksen C, Bruno O, Nielsen F. Efficacy of Nintendo Wii training on mechanical leg muscle function and postural balance in community-dwelling older adults: a randomized controlled trial. J Gerontol A Biol Sci Med Sci. 2012;16:1–8.
Prosperini L, Fortuna D, Giannì C, Leonardi L, Marchetti MR, Pozzilli C. Home-based balance training using the Wii balance board: a randomized, crossover pilot study in multiple sclerosis. Neurorehabil Neural Repair. 2013;27(6):516–25.
Clark RA, Bryant AL, Pua Y, McCrory P, Bennell K, Hunt M. Validity and reliability of the Nintendo Wii Balance Board for assessment of standing balance. Gait Posture. 2010;31(3):307–10.
Yamada M, Aoyama T, Nakamura M, Tanaka B, Nagai K, Tatematsu N, et al. The reliability and preliminary validity of game-based fall risk assessment in community-dwelling older adults. Geriatr Nurs (Minneap). 2011;32(3):188–94.
Larsen LR, Jørgensen MG, Junge T, Juul-Kristensen B, Wedderkopp N. Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board. BMC Pediatr. 2014;14(1):144.
Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, et al. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol. 2011;64(1):96–106.
Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. 1988.
Jørgensen MG, Laessoe U, Hendriksen C, Nielsen BFO, Aagaard P. Intra-rater reproducibility and validity of Nintendo wii balance testing in community-dwelling older adults. J Aging Phys Act. 2014;22(2):269–75.
Mokkink LB, Terwee CB, Gibbons E, Stratford PW, Alonso J, Patrick DL, et al. Inter-rater agreement and reliability of the COSMIN (COnsensus-based Standards for the selection of health status Measurement Instruments) checklist. 2010.
Hopkins WG, Schabort EJ, Hawley JA. Reliability of power in physical performance tests. Sports Med. 2001;31(3):211–34.
Lexell JE, Downham DY. How to assess the reliability of measurements in rehabilitation. Am J Phys Med Rehabil. 2005;84(9):719–23.
Light KE, Spirduso WW. Effects of adult aging on the movement complexity factor of response programming. J Gerontol. 1990;45(3):P107–9.
Meeuwsen HJ, Sawicki TM, Stelmach GE. Improved foot position sense as a result of repetitions in older adults. J Gerontol. 1993;48(3):P137–41.
The authors would like to thank Anders Spliid Hansen from Codetech for his assistance in developing the software for the RT test utilising the NWB.
The authors declare that they have no competing interests. Moreover, the authors have no involvement with the Nintendo corporation and have never received any kind of support from Nintendo or associated businesses.
MGJ: participated in the design of the study, acquisition, analysis and interpretation of data, and prepared the initial draft of the manuscript. SA: participated in the design of the study, the interpretation of data, and revisions of the manuscript. TM & JR participated in the design of the study, and revisions of the manuscript. SP: participated in the design of the study, acquisition of data, and revisions of the manuscript. All authors read and approved the final manuscript.