Development and validation of a machine learning-based fall-related injury risk prediction model using nationwide claims database in Korean community-dwelling older population

Abstract

Background

Falls impact over 25% of older adults annually, making fall prevention a critical public health focus. We aimed to develop and validate a machine learning-based prediction model for serious fall-related injuries (FRIs) among community-dwelling older adults, incorporating various medication factors.

Methods

Using annual national patient sample data, we assigned older outpatients without FRIs in the preceding three months to a development cohort (2018 data) and a validation cohort (2019 data). The outcome of interest was serious FRIs, which we defined operationally as incidents necessitating an emergency department visit or hospital admission, identified by diagnostic codes for injuries that are likely associated with falls. We developed four machine-learning models (Light Gradient Boosting Machine, CatBoost, eXtreme Gradient Boosting, and random forest), along with a logistic regression model as a reference.

Results

In both cohorts, FRIs leading to hospitalization or emergency department visits occurred in approximately 2% of patients. Feature selection reduced an initial set of 187 features to 26, of which 15 were medication-related. CatBoost emerged as the top model, with an area under the receiver operating characteristic curve of 0.700 and sensitivity and specificity of approximately 65%. The high-risk group showed a more than threefold greater risk of FRIs than the low-risk group, and model interpretations aligned with clinical intuition.

Conclusion

We developed and validated an explainable machine-learning model for predicting serious FRIs in community-dwelling older adults. With prospective validation, this model could facilitate targeted fall prevention strategies in primary care or community-pharmacy settings.

Introduction

Falls in older adults are a major public health problem [1]. They can occur at any age, but the incidence and severity of falls and fall-related injuries increase with age [2, 3]. More than one in four older adults fall annually, 10% of older adults report an injury from a fall [2], and falls are a leading cause of death from unintentional injury [4]. The problems caused by falls are not limited to physical harm. Traumatic falls can lead to fear of falling, which in turn leads to complications such as restriction of activities, anxiety, and depression, negatively affecting an individual’s quality of life [5]. Moreover, fear of falling is an independent risk factor for falls among older adults [6]. As the population ages and the burden of falls is expected to increase, establishing effective fall prevention strategies is an urgent task for the healthcare system.

Fall-risk-increasing drugs (FRIDs) include antihypertensives, diuretics, analgesics, antidepressants, antipsychotics, and hypnotics [7, 8]. Polypharmacy and FRIDs, especially psychotropic drugs, are drug-related risk factors for falls. Seppala et al. performed an adjusted meta-analysis of 248 studies and found that antipsychotics, benzodiazepines, and antidepressants were associated with higher odds of falls, with odds ratios of 1.54 (95% confidence interval [CI], 1.28–1.85), 1.57 (95% CI, 1.43–1.74), and 1.42 (95% CI, 1.22–1.65), respectively [9]. Moreover, Dhalwani et al. observed that the incidence rate of falls was 20% and 50% higher in patients receiving > 4 and > 10 drugs, respectively [10].

Interventions targeting medications that increase or decrease fall risk are among the most effective fall prevention strategies [11]. The American Geriatrics Society and British Geriatrics Society guidelines on fall prevention recommend withdrawing or minimizing psychoactive medications and reducing the total number of medications [11]. A meta-analysis of 14 randomized controlled trials that evaluated the effect of medication review on fall prevention in community-dwelling older adults found that adjusting medications associated with falls could decrease the risk of falls, although the risk difference was modest [12]. However, in a recent randomized clinical trial of a multifactorial intervention for fall prevention in a primary care setting, the intervention did not result in a significantly lower rate of serious falls than enhanced usual care among older adults with risk factors for falls [13]. The small number of interventions on FRIDs could be the reason why the multifactorial intervention was not effective. In that trial, only 29% of the participants who were taking FRIDs agreed to address medication-related risk factors, making these the least prioritized risk factor.

Several tools have been validated and widely used to predict and prevent falls in the primary care setting [14,15,16,17,18,19,20]. The guidelines on fall prevention recommend that these tools be used to assess the risk of falling, but there is no clear guidance on which tools to use [21]. Recently, predictive models using advanced analytics have been actively developed, but only a limited number of studies have used machine learning to predict falls in community-dwelling older adults [22,23,24,25,26]. Ikeda et al. developed a prediction model with the eXtreme Gradient Boosting (XGBoost) algorithm using prospectively collected survey data [22]. Makino et al. also used survey data and developed a decision tree model [23]. Ye et al. fitted five different machine learning algorithms using electronic health record data with features comprising demographics, clinical utilization, disease diagnosis, and medication prescriptions [24]. Mishra et al. also used electronic health record data to fit four different machine learning algorithms with features comprising gait measurements, demographics, and several geriatric assessment scores [25]. Engels et al. fitted an ensemble machine learning model using an administrative claims database with features comprising demographics, fall history, and medication use [26]. However, previous studies have several key limitations, such as not considering medications as risk factors (or including only polypharmacy as a risk factor) [22, 23, 25], not attempting to interpret the model [26] or interpreting it solely on the basis of univariate odds ratios [24], and having small sample sizes that limited generalizability to the entire population [25]. In addition, no study has attempted to validate the machine learning algorithms on external cohorts from different time periods.

We aimed to develop and externally validate an interpretable machine learning-based fall-related injury (FRI) prediction model using a claims database, with a particular focus on an extensive range of medications. Using this tool, we expect to identify patients at high risk for FRIs and to provide medication intervention strategies for fall prevention in older adults.

Methods

Data source

This retrospective cohort study was conducted with data obtained from the Korean Health Insurance Review and Assessment Service – Aged Patient Sample (HIRA-APS) databases sampled annually for the years 2018 and 2019. In Korea, the national health insurance system provides coverage for 98% of the population, and the HIRA database contains claims data for over 90% of the population, supporting the generalizability of the analysis [27]. The HIRA-APS dataset is a 10% stratified random sample of claims data for patients aged > 65 years and contains comprehensive information on patient demographics, disease diagnoses based on the International Statistical Classification of Diseases, Tenth Revision, procedures, and prescription details.

Cohort description

From July to September of each year, we identified older adults in the outpatient setting and set the cohort entry date as the date when the patient received a prescription for medications lasting > 30 days in ambulatory care. For a robust operational definition, the following exclusion criteria were applied: (a) patients with no ambulatory prescription in the 6 months before the cohort entry date; (b) patients who had been hospitalized for > 150 days in the 6 months before the cohort entry date; (c) patients with evidence of a recent FRI (diagnostic code of FRI at any position) in the 3 months before the entry date; and (d) patients who died within 3 months of the entry date without any observed FRI (Supplementary Figure S1). Of note, exclusion criterion (c) was applied specifically to reduce the misclassification of individuals undergoing treatment for a previous FRI as incident falls, in line with the methods of prior studies [28].
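The four exclusion criteria above can be sketched as a simple eligibility check. This is a minimal illustration, not the authors' code: the `Candidate` record and all of its field names are hypothetical (the HIRA-APS schema is not described here), and "3 months" is approximated as 90 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "3 months" is approximated as 90 days.
WINDOW_3M = timedelta(days=90)


@dataclass
class Candidate:
    """Hypothetical per-patient summary; field names are illustrative only."""
    entry_date: date
    prior_ambulatory_rx: bool            # any ambulatory prescription in the prior 6 months
    inpatient_days_prior_6m: int         # days hospitalized in the prior 6 months
    last_fri_before_entry: Optional[date] = None
    death_date: Optional[date] = None
    first_fri_after_entry: Optional[date] = None


def is_eligible(c: Candidate) -> bool:
    # (a) no ambulatory prescription in the 6 months before entry
    if not c.prior_ambulatory_rx:
        return False
    # (b) hospitalized for more than 150 days in the 6 months before entry
    if c.inpatient_days_prior_6m > 150:
        return False
    # (c) FRI diagnosis in the 3 months before entry
    if (c.last_fri_before_entry is not None
            and c.entry_date - c.last_fri_before_entry <= WINDOW_3M):
        return False
    # (d) died within 3 months of entry without any observed FRI
    died_early = (c.death_date is not None
                  and c.death_date - c.entry_date <= WINDOW_3M)
    fri_before_death = (died_early
                        and c.first_fri_after_entry is not None
                        and c.first_fri_after_entry <= c.death_date)
    if died_early and not fri_before_death:
        return False
    return True
```

For example, a patient who dies one month after entry is kept only if an FRI was recorded before death, which mirrors criterion (d).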

Outcome and follow-up

The outcome of interest was the incidence of serious FRI. We operationally defined the outcome as an emergency department (ED) visit or hospital admission with a primary or first secondary diagnostic code of non-pathological fracture of the skull, face, cervical region, clavicle, thorax, lumbar region, humerus, forearm, pelvis, hip, fibula, tibia, or ankle; brain injury; or dislocation of the lumbar region, pelvis, hip, knee, shoulder, elbow, cervical region, thorax, or jaw (Supplementary Table S1). Although the operational definition was determined with reference to previous studies [29, 30], external cause codes indicative of FRIs could not be utilized because they were masked from the data for privacy and security reasons. Patients were followed up from the entry date until the earliest of the following: (a) occurrence of FRI, (b) death, or (c) the study end date (the last day of each year).
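The follow-up rule above (earliest of FRI, death, or the last day of the year) can be sketched as follows. The function name, the returned labels, and the tie-breaking order for events on the same date are assumptions made for illustration; the text does not specify how same-day events were handled.

```python
from datetime import date
from typing import Optional, Tuple


def follow_up_end(entry: date,
                  fri: Optional[date],
                  death: Optional[date]) -> Tuple[date, str]:
    """Return (end date, reason) for one patient's follow-up."""
    study_end = date(entry.year, 12, 31)  # last day of the cohort year
    candidates = [(study_end, "study_end")]
    if death is not None:
        candidates.append((death, "death"))
    if fri is not None:
        candidates.append((fri, "fri"))
    # Assumed tie-break: on the same date, an FRI takes precedence over
    # death, and death over administrative censoring.
    priority = {"fri": 0, "death": 1, "study_end": 2}
    return min(candidates, key=lambda c: (c[0], priority[c[1]]))
```

Events dated after 31 December are never selected, because the study end date is always among the candidates.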

Candidate features

We collected 187 candidate features that had been previously reported as risk factors for falls and could be captured in the claims database (Supplementary Table S2) [9, 24, 26, 31,32,33,34,35,36,37]. They included demographics (age, sex, insurance status), healthcare utilization patterns, prior FRIs, specific diagnoses, exposure to FRIDs and other medications that increase or decrease the incidence of FRIs, drug–drug interactions, and drug–disease interactions. Demographics, medications, drug–drug interactions, and drug–disease interactions were assessed at the entry date (for medication exposure, fill date and days supplied were considered), whereas other features were assessed in the 6-month window before the entry date.

Machine learning algorithms and model development

In this study, we assigned the patients from the 2018 database to a development cohort and those from the 2019 database to a validation cohort. To enhance both the accountability and the clarity of our prediction model, we selected four explainable machine learning algorithms: random forest (RF), XGBoost, Light Gradient Boosting Machine (LightGBM), and CatBoost. Our goal was to construct a model that was not only accurate but also comprehensible in its predictive processes. Although these decision tree ensemble models have traditionally been highly accurate, their ‘black box’ nature often hampered practical application because of a lack of interpretability. Recent advances in interpretative frameworks have, however, considerably expanded their applicability in healthcare decision-making [38]. For comparative analysis, we included a logistic regression model as a reference.

In the initial phase with the development cohort, associations among features were analyzed using Spearman’s rank correlation, and the features were filtered so that no pair had a coefficient exceeding 0.9, to avoid multicollinearity. Next, the optimal set of features was explored via sequential backward floating selection [39]. To streamline the feature selection process, we implemented two strategies. First, we downsized the development cohort through one-sided selection to achieve a 1:4 ratio of fallers (minority class) to non-fallers (majority class). Second, we employed the LightGBM model for feature selection, capitalizing on its efficiency and rapid processing of large datasets. The fivefold cross-validated area under the receiver operating characteristic curve (AUROC) was used as the metric for model assessment, and the 1-standard error rule was applied to select the most parsimonious model [40]. Using this approach, we were able to eliminate features with low importance while maintaining the performance and increasing the interpretability of the model. After the final list of features was selected, hyperparameters were tuned on the entire development cohort for each machine learning model using Optuna [41]. In total, 1,000 trials were conducted, and the hyperparameter combination with the highest AUROC was saved for each model. Fivefold cross-validation was again used during this process. The explored parameter ranges and selected parameters are shown in Supplementary Table S3.
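Two of the steps above, the Spearman correlation filter and the 1-standard error rule, can be illustrated with a short standard-library sketch. Several details are assumptions because the text does not specify them: the filter compares absolute coefficients, it keeps the first feature of a correlated pair, and features are assumed to be non-constant. The function names and the example scores in the usage note are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (non-constant inputs)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |rho| with every kept one <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def one_se_choice(cv_scores):
    """cv_scores maps feature count -> list of fold AUROCs.

    Returns the smallest feature count whose mean AUROC is within one
    standard error of the best mean AUROC.
    """
    stats = {k: (mean(v), stdev(v) / sqrt(len(v))) for k, v in cv_scores.items()}
    best = max(stats, key=lambda k: stats[k][0])
    floor = stats[best][0] - stats[best][1]
    return min(k for k, (m, _) in stats.items() if m >= floor)
```

With made-up fold scores such as `{10: [0.68, 0.69, 0.70, 0.69, 0.69], 26: [0.70, 0.70, 0.71, 0.69, 0.70], 50: [0.70, 0.71, 0.71, 0.69, 0.70]}`, the rule prefers 26 features over 50 because the smaller model is within one standard error of the best mean.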

Performance measures

All prediction performance was measured in the validation cohort. To assess discrimination performance, we measured the AUROC at 3 months. The cutoff point was determined by maximizing the Youden index [42]. We reported other metrics, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), at this cutoff point. In addition, a cumulative incidence plot was drawn to show the difference in fall risk between the groups stratified by the model’s cutoff point. Calibration was assessed visually using a calibration plot. Finally, we used SHapley Additive exPlanations (SHAP) for model interpretation [38].
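The two discrimination quantities named above can be written out in a few lines. The sketch below computes the AUROC as the probability that a randomly chosen case scores higher than a randomly chosen non-case, and finds the cutoff that maximizes the Youden index J = sensitivity + specificity − 1. It is a brute-force illustration for small inputs, not the implementation used in the study; the function names are assumptions, and scores at or above the cutoff are treated as high risk.

```python
def auroc(y_true, scores):
    """Fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(y_true, scores):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A patient is classified as high risk when score >= cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]
```

Because the Youden index weights sensitivity and specificity equally, the resulting cutoff tends to balance the two, which is consistent with both being reported at roughly 65% in this study.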

Statistical analyses

Patient characteristics were summarized as percentages or means (standard deviations). The χ2 or Fisher’s exact test was applied to compare categorical variables between groups, whereas t-tests were used to compare continuous variables between groups. The Spearman rank correlation was used to analyze the correlation among the features. To investigate the association between the occurrence of FRI and each feature, logistic regression was performed. The DeLong test was conducted to compare AUROCs between models. Statistical significance was defined as a p-value < 0.05. All analyses were performed using SAS version 9.4 and Python version 3.9.7.

Results

Characteristics of the development and validation cohorts

Out of a total of 1,475,818 older patients, 520,603 from the 2018 dataset were included in the development cohort and 552,731 from the 2019 dataset were included in the validation cohort (Supplementary Figure S2). Although most variables showed statistically significant differences owing to the large sample size, patient characteristics in the development and validation cohorts were similar; FRIs leading to hospitalization or an ED visit were observed in 1.8% and 1.7% of the patients in the development and validation cohorts, respectively. Approximately 40% of the patients were male, 6% had a history of falls, and patients were taking seven medications on average (Table 1).

Table 1 Baseline characteristics of the study participants in the development and validation cohorts

Model performance

After the feature selection process, 26 of the 187 candidate features were selected. The final features were sex, age group, insurance status, number of admissions or ED visits, seven comorbidities (e.g., prior FRI, dorsopathy, hyperlipidemia), 13 medication factors (e.g., number of medications, number of central nervous system [CNS] depressants, bisphosphonate, steroid), and two drug–disease interactions (e.g., CNS depressant use in patients with a history of fracture). The final list of features and their association with future falls can be found in Supplementary Table S4. The AUROC of each model is summarized in Fig. 1A. All machine learning-based models showed higher performance than logistic regression. However, the difference in performance among all five models was negligible (AUROC, 0.700, 0.700, 0.699, 0.699, and 0.698 for CatBoost, LightGBM, XGBoost, RF, and logistic regression, respectively). A calibration plot was drawn to determine whether the observed and predicted probabilities were consistent (Fig. 1B). The predicted and actual probabilities of FRIs within 90 days, divided into deciles, showed concordance across all models. CatBoost was selected as our final model owing to its highest discrimination performance among the models considered.
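The decile-based calibration check described above can be sketched as follows: patients are sorted by predicted probability, split into ten groups of near-equal size, and the mean predicted probability in each group is compared with the observed event rate. The function name and the binning by sorted position are assumptions; the text does not state how the deciles were formed.

```python
def calibration_by_decile(y_true, probs, n_bins=10):
    """Return a list of (mean predicted probability, observed event rate),
    one pair per bin, after sorting patients by predicted probability."""
    pairs = sorted(zip(probs, y_true))
    n = len(pairs)
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:  # fewer patients than bins
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, observed))
    return table
```

In a well-calibrated model the two numbers in each pair are close, so the points fall near the diagonal of the calibration plot.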

Fig. 1

Discrimination and calibration performance of each model. A Receiver operating characteristic curve of each model. B Calibration plot of each model

Table 2 shows the performance measures of each model at the cutoff point determined using the Youden index. CatBoost showed sensitivity, specificity, PPV, and NPV of 64.7%, 65.2%, 1.9%, and 99.5%, respectively. On Kaplan–Meier analysis, the curves for the two risk groups were clearly separated (presented for CatBoost only) (Fig. 2), with the high-risk group showing a more than three times higher risk of FRIs than the low-risk group (hazard ratio, 3.22; 95% CI, 3.09–3.36).
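The low PPV alongside a high NPV follows from how predictive values depend on outcome prevalence. The sketch below applies Bayes' rule to the reported sensitivity and specificity. The prevalence passed in is a hypothetical 1%, chosen only for illustration; the 3-month FRI incidence in the validation cohort is not reported in the text, so the output should not be read as a reproduction of Table 2.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV, NPV, and the fraction of patients flagged as high risk."""
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "flagged": tp + fp,
    }


# Reported CatBoost sensitivity and specificity; prevalence is an assumption.
result = predictive_values(0.647, 0.652, prevalence=0.01)
```

Under this assumed prevalence, roughly one-third of patients are flagged as high risk while only a small percentage of those flagged go on to have the outcome, which illustrates why PPV stays low for a rare outcome even when sensitivity and specificity are balanced.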

Table 2 Performance comparison of each model on validation cohort

Fig. 2

Kaplan–Meier curves for cumulative incidence of fall-related injury by risk group

Model interpretation

The SHAP summary plot for CatBoost is presented in Fig. 3A, while those for the other models (LightGBM, XGBoost, and RF) can be found in Supplementary Figure S3. The plot summarizes the importance of the features and their effects on prediction together, with each point representing an individual patient’s feature value and its effect on the model output. The top 10 important features identified in the model were age group, sex, number of medications, dorsopathies, prior FRI, number of admissions or ED visits, number of CNS depressants, hyperlipidemia, CNS depressant use with prior fracture, and exposure to an acetylcholinesterase inhibitor. The model was applied to an individual patient with FRI, and the result was depicted using a SHAP waterfall plot (Fig. 3B, Figure S3). The plot shows how the prediction is made at the individual patient level. Again, features are sorted in descending order of their effect on the model output, and the direction of each effect is shown. Prior FRI, exposure to 18 medications, Parkinson disease, CNS depressant use with prior fracture, and exposure to two distinct CNS depressants pushed the model to predict that the patient would experience FRI, whereas male sex, absence of admission or ED visit history, and age 70–74 years pushed the model to predict that the patient would be unlikely to experience FRI.
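The additive logic behind a waterfall plot can be shown with exact Shapley values on a toy model. The sketch below is not the SHAP library and not the fitted CatBoost model: `toy_risk` is a hypothetical score with an invented interaction term, the feature names are illustrative, and a single all-zero reference patient stands in for the background expectation that SHAP normally uses. It enumerates all feature orderings, which is feasible only for a handful of features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over all orderings, with absent features held at their baseline value."""
    names = list(x)
    phi = {n: 0.0 for n in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        prev = model(current)
        for n in order:
            current[n] = x[n]          # switch feature n to the patient's value
            new = model(current)
            phi[n] += new - prev       # marginal contribution in this ordering
            prev = new
    return {n: total / len(orders) for n, total in phi.items()}


def toy_risk(f):
    """Hypothetical risk score with one interaction term (illustration only)."""
    score = 0.02 + 0.03 * f["prior_fri"] + 0.002 * f["n_medications"]
    if f["cns_depressant"] and f["prior_fracture"]:
        score += 0.04
    return score


patient = {"prior_fri": 1, "n_medications": 18, "cns_depressant": 1, "prior_fracture": 1}
reference = {"prior_fri": 0, "n_medications": 0, "cns_depressant": 0, "prior_fracture": 0}
phi = shapley_values(toy_risk, patient, reference)
```

The contributions sum to the difference between the patient's score and the reference score, which is the property a waterfall plot visualizes; the interaction term is split equally between the two features that jointly trigger it.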

Fig. 3

Interpretation of the model output. FRI, fall-related injury; CNS, central nervous system; ED, emergency department. A SHapley Additive exPlanations (SHAP) summary plot. The color represents the value of each feature, with red representing higher values and blue representing lower values. The SHAP value on the x-axis explains the direction and degree of the model’s prediction, where large positive values contribute to the prediction that a patient will experience fall-related injury, large negative values contribute to the prediction that a patient will not experience fall-related injury, and values close to zero contribute little to the prediction. B SHAP waterfall plot. Patient level prediction is depicted. Similarly, the SHAP value on the x-axis explains the direction and degree of the model’s prediction, where large positive values contribute to the prediction that a patient will experience fall-related injury, large negative values contribute to the prediction that a patient will not experience fall-related injury, and values close to zero contribute little to the prediction

Discussion

This study developed and validated an FRI prediction model for community-dwelling older adults using a claims database. Our best-performing model showed a fair ability to discriminate between individuals who experienced FRI and those who did not [43] (AUROC, 0.70). By focusing on 35.1% of the patients, we could capture almost two-thirds of FRIs. Contrary to expectations, the machine learning models showed only a slight improvement in performance compared with logistic regression. This trend was also seen in a prior study that predicted falls from an administrative claims database with features similar to those in our study [26]. The model’s selected features and interpretation aligned well with clinical intuition. Specifically, the model predicted a higher risk of FRI for individuals who were older, were female, or had a prior FRI, and the predicted risk rose with the number of CNS depressants and the total number of medications [34]. Our model identified dorsopathy as an important risk factor for FRIs, which is also consistent with the results of prior studies that have identified back pain as an independent risk factor for falls [44]. Contrary to our intuition, the use of certain antihypertensives was associated with a lower risk of FRI in our study. Although the mechanism is not fully understood, a similar trend has been observed in other studies [32, 45]. A meta-analysis conducted by de Vries et al. reported that beta-blockers showed a protective effect against falls [32]. Ang et al.’s meta-analysis also demonstrated that beta-blockers and angiotensin-converting enzyme inhibitors were associated with a lower risk of injurious falls [45]. In contrast, Butt et al. found that the incidence rate of falls was significantly higher within the first 14 days after initiation for all classes of antihypertensives [46]. Taken together, these studies suggest that antihypertensives may increase the risk of falls in the initiation period but not in the maintenance period.

Additional care needs to be taken in interpreting this model. For instance, exposure to bisphosphonate seems to increase the risk of FRIs, but it would be more reasonable to interpret this as reflecting underlying osteoporosis in the exposed population. Similarly, hyperlipidemia and menopause appear to be protective against falls, possibly due to the increase in bone density resulting from the use of statins or hormone replacement therapy rather than the disease itself [47, 48]. Hence, when interpreting the output of the model (which is entirely dependent on the user), it is necessary to determine whether the result is due to the influence of the medication or whether it is simply a result of the modeling process.

Our study has some limitations. First, our model’s performance was not optimal, with an AUROC of 0.70, compared with previous machine learning-based fall prediction models (AUROC range, 0.70–0.88) [22,23,24,25,26]. This is possibly because physical examination results, such as gait and muscle strength, and laboratory values, such as bone mineral density, which are potentially key features for predicting FRIs, cannot be obtained from a claims database. Second, owing to the nature of claims data, it is not known whether the individual actually took the prescribed medications. Third, while the diagnostic codes used to identify FRIs are informed by prior studies [29, 30], the injuries they capture may not be exclusively attributable to falls. The possibility that the injuries resulted from other causes, such as vehicular accidents, cannot be entirely excluded, given that the external cause of injury codes were masked in our dataset. However, substantial evidence suggests that a large proportion of unintentional injuries among older adults are caused by falls. For instance, from 2016 to 2020, falls accounted for 57% of fatal unintentional injuries and 65% of non-fatal unintentional injuries in this demographic [49]. These data suggest that any misclassification bias in our study is unlikely to substantially affect the validity of our findings. Fourth, because the HIRA-APS data are sampled annually, the same patients may be included in both the 2018 sample database used for model development and the 2019 sample database used for validation. Owing to the anonymized nature of the data, we were unable to identify duplicate patients. Nevertheless, we believe that this should not significantly impact the results.

Despite these potential limitations, our prediction model is still valuable in that it was derived from a nationally representative dataset of the older adult population, making it more generalizable than models based on data from a single institution. Moreover, the focus on FRIs resulting in admission or an ED visit as the primary outcome underscores the clinical significance of this study and may contribute to the development of fall prevention programs that improve patient outcomes. Because it uses a claims database, our model benefits from automated data acquisition, which facilitates the identification of populations at high risk for FRIs without additional assessment.

Our model was designed with the intention of serving as a national surveillance tool for monitoring fall-related injuries in South Korea, where the Health Insurance Review and Assessment Service (HIRA) operates a Drug Utilization Review (DUR) system. This system is instrumental in providing real-time alerts to healthcare providers about critical issues like contraindicated drug interactions, redundant prescribing, age-related contraindications, and excessive dosage [50]. Given that our model is constructed exclusively from claims data, it is conceivable that HIRA could integrate our predictive model into the DUR system to enhance its functionality. Such an advancement would allow for the automatic and real-time processing of data to pinpoint high-risk individuals, thus facilitating proactive education and timely interventions for fall-related injuries, greatly contributing to patient safety and care. Furthermore, our study stands out as the only available prediction model for FRIs in community-dwelling older adults that has been evaluated in an external validation cohort with different time periods, whereas previous studies only underwent internal validation using the random split-sample method and cross-validation.

Conclusions

We developed and externally validated a novel explainable machine learning-based FRI prediction model using a national sample claims database. We found that applying a machine learning approach to predict FRIs in older adults is feasible. Although the performance is not optimal, a simple and ready-to-use claims data-driven model can be utilized in routine primary care practice or community pharmacy for targeted intervention. Further prospective studies are required to evaluate and validate the usefulness of the model in clinical practice.

Availability of data and materials

The datasets used in the study can be accessed from the Health Insurance Review and Assessment service, but their use is limited due to licensing and not intended for public release. However, data will be shared on reasonable request to the corresponding author with the permission of the Health Insurance Review and Assessment service.

Abbreviations

FRIDs: Fall-risk-increasing drugs
FRI: Fall-related injury
CI: Confidence interval
XGBoost: eXtreme Gradient Boosting
HIRA: Health Insurance Review and Assessment Service
APS: Aged Patient Sample
ED: Emergency department
RF: Random forest
LightGBM: Light Gradient Boosting Machine
AUROC: Area under the receiver operating characteristic
PPV: Positive predictive value
NPV: Negative predictive value
SHAP: SHapley Additive exPlanations
CNS: Central nervous system

References

  1. Peel NM. Epidemiology of falls in older age. Can J Aging. 2011;30:7–19.

  2. Moreland B, Kakara R, Henry A. Trends in nonfatal falls and fall-related injuries among adults aged ≥65 years - United States, 2012–2018. MMWR Morb Mortal Wkly Rep. 2020;69:875–81.

  3. Ghosh M, O’Connell B, Afrifa-Yamoah E, Kitchen S, Coventry L. A retrospective cohort study of factors associated with severity of falls in hospital patients. Sci Rep. 2022;12:12266.

  4. Burns E, Kakara R. Deaths from falls among persons aged ≥65 years - United States, 2007–2016. MMWR Morb Mortal Wkly Rep. 2018;67:509–14.

  5. Schoene D, Heller C, Aung YN, Sieber CC, Kemmler W, Freiberger E. A systematic review on the influence of fear of falling on quality of life in older people: is there a role for falls? Clin Interv Aging. 2019;14:701–19.

  6. Gazibara T, Kurtagic I, Kisic-Tepavcevic D, Nurkovic S, Kovacevic N, Gazibara T, Pekmezovic T. Falls, risk factors and fear of falling among persons older than 65 years of age. Psychogeriatrics. 2017;17:215–23.

  7. De Winter S, Vanwynsberghe S, Foulon V, Dejaeger E, Flamaing J, Sermon A, Van der Linden L, Spriet I. Exploring the relationship between fall risk-increasing drugs and fall-related fractures. Int J Clin Pharm. 2016;38:243–51.

  8. Milos V, Bondesson A, Magnusson M, Jakobsson U, Westerlund T, Midlov P. Fall risk-increasing drugs and falls: a cross-sectional study among elderly patients in primary care. BMC Geriatr. 2014;14:40.

  9. Seppala LJ, Wermelink A, de Vries M, Ploegmakers KJ, van de Glind EMM, Daams JG, van der Velde N, EUGMS Task and Finish Group on Fall-Risk-Increasing Drugs. Fall-risk-increasing drugs: a systematic review and meta-analysis: II. Psychotropics. J Am Med Dir Assoc. 2018;19:371.e11–e17.

  10. Dhalwani NN, Fahami R, Sathanapally H, Seidu S, Davies MJ, Khunti K. Association between polypharmacy and falls in older adults: a longitudinal study from England. BMJ Open. 2017;7:e016358.

  11. Panel on Prevention of Falls in Older Persons AGS, British Geriatrics S. Summary of the Updated American Geriatrics Society/British Geriatrics Society clinical practice guideline for prevention of falls in older persons. J Am Geriatr Soc. 2011;59:148–57.

  12. Ming Y, Zecevic AA, Hunter SW, Miao W, Tirona RG. Medication review in preventing older adults’ fall-related injury: a systematic review & meta-analysis. Can Geriatr J. 2021;24:237–50.

  13. Bhasin S, Gill TM, Reuben DB, Latham NK, Ganz DA, Greene EJ, Dziura J, Basaria S, Gurwitz JH, Dykes PC, et al. A randomized trial of a multifactorial strategy to prevent serious fall injuries. N Engl J Med. 2020;383:129–40.

  14. Tinetti ME. Performance-oriented assessment of mobility problems in elderly patients. J Am Geriatr Soc. 1986;34:119–26.

  15. Podsiadlo D, Richardson S. The timed “Up & Go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc. 1991;39:142–8.

  16. Guralnik JM, Simonsick EM, Ferrucci L, Glynn RJ, Berkman LF, Blazer DG, Scherr PA, Wallace RB. A short physical performance battery assessing lower extremity function: association with self-reported disability and prediction of mortality and nursing home admission. J Gerontol. 1994;49:M85-94.

  17. Franchignoni F, Horak F, Godi M, Nardone A, Giordano A. Using psychometric techniques to improve the Balance Evaluation Systems Test: the mini-BESTest. J Rehabil Med. 2010;42:323–31.

  18. Kempen GI, Yardley L, van Haastregt JC, Zijlstra GA, Beyer N, Hauer K, Todd C. The Short FES-I: a shortened version of the falls efficacy scale-international to assess fear of falling. Age Ageing. 2008;37:45–50.

  19. Meekes WM, Korevaar JC, Leemrijse CJ, van de Goor IA. Practical and validated tool to assess falls risk in the primary care setting: a systematic review. BMJ Open. 2021;11:e045431.

  20. Stevens JA. The STEADI Tool Kit: a fall prevention resource for health care providers. IHS Prim Care Provid. 2013;39:162–6.

  21. Montero-Odasso M, van der Velde N, Martin FC, Petrovic M, Tan MP, Ryg J, Aguilar-Navarro S, Alexander NB, Becker C, Blain H, et al. World guidelines for falls prevention and management for older adults: a global initiative. Age Ageing. 2022;51(9):afac205.

  22. Ikeda T, Cooray U, Hariyama M, Aida J, Kondo K, Murakami M, Osaka K. An interpretable machine learning approach to predict fall risk among community-dwelling older adults: a three-year longitudinal study. J Gen Intern Med. 2022;37:2727–35.

  23. Makino K, Lee S, Bae S, Chiba I, Harada K, Katayama O, Tomida K, Morikawa M, Shimada H. Simplified decision-tree algorithm to predict falls for community-dwelling older adults. J Clin Med. 2021;10:5184.

  24. Ye C, Li J, Hao S, Liu M, Jin H, Zheng L, Xia M, Jin B, Zhu C, Alfreds ST, et al. Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. Int J Med Inform. 2020;137:104105.

  25. Mishra AK, Skubic M, Despins LA, Popescu M, Keller J, Rantz M, Abbott C, Enayati M, Shalini S, Miller S. Explainable fall risk prediction in older adults using gait and geriatric assessments. Front Digit Health. 2022;4:869812.

  26. Engels A, Reber KC, Lindlbauer I, Rapp K, Buchele G, Klenk J, Meid A, Becker C, Konig HH. Osteoporotic hip fracture prediction from risk factors available in administrative claims data - A machine learning approach. PLoS One. 2020;15:e0232969.

  27. Kim L, Kim JA, Kim S. A guide for the utilization of health insurance review and assessment service national patient samples. Epidemiol Health. 2014;36:e2014008.

  28. Wright NC, Daigle SG, Melton ME, Delzell ES, Balasubramanian A, Curtis JR. The design and validation of a new algorithm to identify incident fractures in administrative claims data. J Bone Miner Res. 2019;34:1798–807.

  29. Tinetti ME, Han L, Lee DS, McAvay GJ, Peduzzi P, Gross CP, Zhou B, Lin H. Antihypertensive medications and serious fall injuries in a nationally representative sample of older adults. JAMA Intern Med. 2014;174:588–95.

  30. Mintz J, Duprey MS, Zullo AR, Lee Y, Kiel DP, Daiello LA, Rodriguez KE, Venkatesh AK, Berry SD. Identification of fall-related injuries in nursing home residents using administrative claims data. J Gerontol A Biol Sci Med Sci. 2022;77:1421–9.

  31. Homer ML, Palmer NP, Fox KP, Armstrong J, Mandl KD. Predicting falls in people aged 65 years and older from insurance claims. Am J Med. 2017;130:744.e17–23.

  32. de Vries M, Seppala LJ, Daams JG, van de Glind EMM, Masud T, van der Velde N, EuGMS Task and Finish Group on Fall-Risk-Increasing Drugs. Fall-risk-increasing drugs: a systematic review and meta-analysis: I. Cardiovascular drugs. J Am Med Dir Assoc. 2018;19:371.e1–9.

  33. Seppala LJ, van de Glind EMM, Daams JG, Ploegmakers KJ, de Vries M, Wermelink A, van der Velde N. Fall-risk-increasing drugs: a systematic review and meta-analysis: III. Others. J Am Med Dir Assoc. 2018;19:372.e1-e8.

  34. Deandrea S, Lucenteforte E, Bravi F, Foschi R, La Vecchia C, Negri E. Risk factors for falls in community-dwelling older people: a systematic review and meta-analysis. Epidemiology. 2010;21:658–68.

  35. American Geriatrics Society. 2019 updated AGS Beers Criteria® for potentially inappropriate medication use in older adults. J Am Geriatr Soc. 2019;67:674–94.

  36. O’Mahony D, O’Sullivan D, Byrne S, O’Connor MN, Ryan C, Gallagher P. STOPP/START criteria for potentially inappropriate prescribing in older people: version 2. Age Ageing. 2015;44:213–8.

  37. Gade GV, Jorgensen MG, Ryg J, Riis J, Thomsen K, Masud T, Andersen S. Predicting falls in community-dwelling older adults: a systematic review of prognostic models. BMJ Open. 2021;11:e044170.

  38. Lundberg S, Lee S-I. A unified approach to interpreting model predictions. 2017.

  39. Pudil P, Novovičová J, Kittler J. Floating search methods in feature selection. Pattern Recogn Lett. 1994;15:1119–25.

  40. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. 2nd ed. New York: Springer; 2009.

  41. Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Anchorage, AK: Association for Computing Machinery; 2019. p. 2623–31.

  42. Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3:32–5.

  43. Muller MP, Tomlinson G, Marrie TJ, Tang P, McGeer A, Low DE, Detsky AS, Gold WL. Can routine laboratory tests discriminate between severe acute respiratory syndrome and other causes of community-acquired pneumonia? Clin Infect Dis. 2005;40:1079–86.

  44. Marshall LM, Litwack-Harrison S, Makris UE, Kado DM, Cawthon PM, Deyo RA, Carlson NL, Nevitt MC. A prospective study of back pain and risk of falls among older community-dwelling men. J Gerontol A Biol Sci Med Sci. 2017;72:1264–9.

  45. Ang HT, Lim KK, Kwan YH, Tan PS, Yap KZ, Banu Z, Tan CS, Fong W, Thumboo J, Ostbye T, et al. A systematic review and meta-analyses of the association between anti-hypertensive classes and the risk of falls among older adults. Drugs Aging. 2018;35:625–35.

  46. Butt DA, Mamdani M, Austin PC, Tu K, Gomes T, Glazier RH. The risk of falls on initiation of antihypertensive drugs in the elderly. Osteoporos Int. 2013;24:2649–57.

  47. Wang Z, Li Y, Zhou F, Piao Z, Hao J. Effects of statins on bone mineral density and fracture risk: a PRISMA-compliant systematic review and meta-analysis. Medicine (Baltimore). 2016;95:e3042.

  48. Zhu L, Jiang X, Sun Y, Shu W. Effect of hormone therapy on the risk of bone fractures: a systematic review and meta-analysis of randomized controlled trials. Menopause. 2016;23:461–70.

  49. Centers for Disease Control and Prevention (CDC), Web-Based Injury Statistics Query and Reporting System (WISQARS™). Available online: https://www.cdc.gov/injury/wisqars/index.html. Accessed 10 Nov 2023.

  50. Kim DS, Je NK, Park J, Lee S. Effect of nationwide concurrent drug utilization review program on drug-drug interactions and related health outcome. Int J Qual Health Care. 2021;33:mzab118.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2020R1A2C110097111) and the Creative-Pioneering Researchers Program through Seoul National University.

Author information

Contributions

Conceptualization: Ju-Yeun Lee, Seung-Bo Lee; Methodology: Kyu-Nam Heo, Jeong Yeon Seok, Young-Mi Ah, Kwang-il Kim; Investigation: Kyu-Nam Heo; Data curation: Jeong Yeon Seok; Formal analysis: Kyu-Nam Heo, Jeong Yeon Seok; Visualization: Kyu-Nam Heo, Jeong Yeon Seok; Validation: Young-Mi Ah, Kwang-il Kim, Ju-Yeun Lee, Seung-Bo Lee; Writing-original and revised draft preparation: Kyu-Nam Heo, Jeong Yeon Seok; Writing-review & editing: Young-Mi Ah, Kwang-il Kim, Seung-Bo Lee, Ju-Yeun Lee; Supervision: Seung-Bo Lee, Ju-Yeun Lee; Funding acquisition: Ju-Yeun Lee. The first draft of the manuscript was written by Kyu-Nam Heo and Jeong Yeon Seok, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Seung-Bo Lee or Ju-Yeun Lee.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Seoul National University Institutional Review Board (IRB No. E2212/004-007). The need for informed consent was waived by the Seoul National University Institutional Review Board, as only de-identified information was provided with no linkable data elements. All methods were carried out in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1. Graphical depiction of entry date, assessment window and follow-up. Figure S2. Patient selection flow. Figure S3. SHAP summary plot for LightGBM, XGBoost, and Random Forest. Table S1. Diagnostic codes to identify fall-related injuries. Table S2. Summary of candidate features (n=187). Table S3. Explored parameter fields and selected parameters. Table S4. Association between selected features and fall-related injury.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Heo, KN., Seok, J.Y., Ah, YM. et al. Development and validation of a machine learning-based fall-related injury risk prediction model using nationwide claims database in Korean community-dwelling older population. BMC Geriatr 23, 830 (2023). https://doi.org/10.1186/s12877-023-04523-8

  • DOI: https://doi.org/10.1186/s12877-023-04523-8

Keywords