Edmonton frailty scale score predicts postoperative delirium: a retrospective cohort analysis
BMC Geriatrics volume 22, Article number: 585 (2022)
Frailty has been associated with postoperative delirium (POD). Studies suggest that the Fried phenotype has a stronger association with POD than the Edmonton Frailty Scale (EFS) criteria. Although phenotypic frailty is recognized as a good predictor of delirium, the EFS has higher ratings for feasibility in the surgical setting. Thus, our aim was to determine the association between EFS-assessed vulnerability and POD in an elective surgical population of older adults. A secondary aim was to determine which domains assessed by the EFS were closely associated with POD.
After IRB approval was received, electronic medical records were downloaded for patients who underwent surgery at our institution between December 1, 2018, and March 1, 2020. Inclusion criteria were age ≥ 65 years, preoperative EFS assessment within 6 months of surgery, elective surgery not scheduled for intensive care unit (ICU) stay but followed by at least 1 day of postoperative stay, and at least two in-hospital evaluations with the 4 A’s test (arousal, attention, abbreviated mental test-4, acute change [4AT]) on the surgical ward. Vulnerability was defined as an EFS score ≥ 6. Patients were stratified into two groups according to highest postoperative 4AT score: 0–3 (no POD) and ≥ 4 (POD). Odds of POD associated with EFS score ≥ 6 were evaluated by using logistic regression adjusted for potential confounders.
The dataset included 324 patients. Vulnerability was associated with higher incidence of POD (p = 0.0007, Fisher’s exact). EFS ≥ 6 was consistently associated with POD in all bivariate models. Vulnerability predicted POD in multivariable modeling (OR = 3.5, 95% CI 1.1 to 11.5). Multivariable analysis of EFS domains revealed an overall trend in which higher scores per domain were associated with higher odds of POD. The strongest association occurred with presence of incontinence (OR = 3.8, 95% CI 1.2 to 11.0).
EFS criteria for vulnerability predict POD in older, non-ICU patients undergoing elective surgery.
Frailty, as determined by either phenotypic or deficit accumulation instruments, has been associated with a higher incidence of postoperative delirium (POD), defined as an acute confusional state that occurs after surgery in an older adult and is characterized by inattention, abnormal level of consciousness, thought disorganization, and a fluctuating course. Indeed, a recent meta-analysis that compared POD incidence in frail versus non-frail older patients undergoing elective surgery supported these findings, reporting an adjusted odds ratio (OR) estimate of 2.14 (95% confidence interval [CI] = 1.43–3.19). However, this meta-analysis combined studies that used either phenotypic or deficit accumulation criteria. In addition, the range of POD incidence for the included studies was 7 to 56% owing to differences in patients, surgical procedures, and surgical risks. Although frailty is generally recognized as a predictor of delirium, it is less clear what POD risk level is associated with frailty for older patients undergoing in-patient elective surgeries that do not require postoperative intensive care unit (ICU) care.
Many frailty assessment tools, both physical phenotype and deficit accumulation, strongly predict POD in older adults. A recent meta-analysis suggests that the physical frailty phenotype has the strongest association with POD. Although that study was not a head-to-head comparison and was underpowered, it supports the notion that deficit accumulation frailty screening tools, such as the frailty index (FI) or Edmonton Frailty Scale (EFS), may underestimate the association between frailty and POD. Therefore, another open question is whether frailty screening with deficit accumulation instruments is associated with POD risk in lower-risk surgical settings.
On the other hand, deficit accumulation frailty instruments, such as FI and EFS, have predominantly positive feasibility ratings compared to the frailty physical phenotype, and are commonly used in geriatrics to screen for underlying vulnerability. Practitioners need an easy screening test for frailty or high-risk conditions that would alert the perioperative team to perform a more rigorous evaluation in the form of a comprehensive geriatric assessment. The FI and EFS are comparable in both feasibility and their association with POD. However, EFS does require training for measurement of its physical components. At our institution EFS is used in the surgical clinics for preoperative frailty screening. This decision was based on the difference in reported time requirement for completion of EFS (< 5 min) vs FI (10–12.5 min).
The EFS assesses multiple domains including cognition, hospital admissions and general health, ADL needs, social support, polypharmacy and forgetting to take medications, weight loss, depression, incontinence, and level of function. Although the EFS does not specifically assess for delirium risk, the American College of Surgeons best practice guidelines for geriatric patient care recommend screening for several POD risk factors contained within the EFS domains. Because the EFS assesses many of the delirium risk factors contained in the American College of Surgeons’ screening recommendations, we hypothesized that vulnerability, as detected by the EFS, would be an independent risk factor for POD. Given its feasibility and use in our clinical practice, we wanted to know whether EFS-determined vulnerability is a predictor of POD risk in older patients undergoing lower-risk surgery. Our aim was to determine the association between EFS-assessed vulnerability and POD in a sample of older, non-ICU patients undergoing lower-risk elective surgery. A secondary aim was to determine which domains assessed by the EFS were closely associated with POD in this population.
After receiving IRB approval, which included a waiver of consent requirements, we downloaded data from the electronic medical record (EMR) for patients who underwent surgery at our institution between December 1, 2018, and March 1, 2020. During this time period 83 eligible vulnerable subjects (EFS ≥ 6) were identified. Downloads contained data on demographics, medication usage, vital signs, and nursing documentation. The data security protocol was reviewed and approved by the institutional Data Trust.
The following inclusion criteria were used for this retrospective cohort analysis:
Age ≥ 65 years at the time of surgery
Preoperative EFS assessment within 6 months of surgery (Note: EFS is a licensed product; waiver for use was granted by D. Rolfson)
Initial post-anesthesia recovery in the post-anesthesia care unit (PACU)
Not initially scheduled for postoperative intensive care. However, patients who were admitted to the ICU later in their hospital stay were included.
At least 1 overnight stay on the surgical ward immediately after PACU discharge.
At least two in-hospital evaluations with the 4 A’s test©  (arousal, attention, abbreviated mental test-4, acute change [4AT]) during the patient’s stay on the surgical ward. (Note: The 4AT policy allows free downloads, use, and copying as required for non-commercial or research use).
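The inclusion criteria above can be expressed as a simple record filter. The following is a minimal sketch only: the record layout, every field name, and the 183-day approximation of "6 months" are assumptions made for illustration, since the source does not describe the EMR schema. The elective-surgery flag reflects the study population described elsewhere in the article rather than an item in the list itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SurgicalRecord:
    # Hypothetical fields; the actual EMR download format is not given in the source.
    age_at_surgery: int
    surgery_date: date
    efs_date: Optional[date]        # date of preoperative EFS assessment, if any
    elective: bool                  # elective (not urgent/emergent) surgery
    recovered_in_pacu: bool         # initial recovery in the PACU
    icu_planned: bool               # initially scheduled for postoperative ICU care
    ward_nights_after_pacu: int     # overnight stays on the surgical ward
    ward_4at_count: int             # number of 4AT evaluations on the ward


def meets_inclusion(r: SurgicalRecord, efs_window_days: int = 183) -> bool:
    """Return True if the record satisfies all listed inclusion criteria.

    efs_window_days approximates "within 6 months" (an assumption).
    """
    if r.efs_date is None:
        return False
    days_before = (r.surgery_date - r.efs_date).days
    return (
        r.age_at_surgery >= 65
        and 0 <= days_before <= efs_window_days
        and r.elective
        and r.recovered_in_pacu
        and not r.icu_planned       # later unplanned ICU admission does not exclude
        and r.ward_nights_after_pacu >= 1
        and r.ward_4at_count >= 2
    )
```

Keeping the criteria in one boolean expression makes it easy to audit the filter against the list item by item.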
Since March 1, 2018, patients ≥ 65 years of age in the surgical clinics at our institution have undergone preoperative screening for vulnerability with the EFS. To implement EFS screening, all surgery clinic nursing personnel (n = 18) underwent standardized in-service training for EFS administration followed by competency testing. A patient with an EFS score ≥ 6 is defined as vulnerable. Since December 1, 2018, the nursing staff has documented in-hospital delirium assessments on the surgical ward at least once every 12 hours using the 4AT score in all surgical patients ≥ 65 years of age. At our institution, a 4AT score ≥ 4 after surgery is considered a positive screen for POD. The 4AT has been validated as a screening instrument for delirium with 84.9% specificity and 89.7% sensitivity on the hospital ward and 99.2% specificity and 95.5% sensitivity in the PACU. In our study population, a small number of patients with prolonged hospital stays were admitted to the ICU after their initial PACU and ward admission. In these cases, the CAM-ICU© score, routinely obtained by the ICU nursing staff each 8-hour shift, was included in the analysis. EFS, 4AT, and CAM-ICU scores, as well as their individual components, are documented in the EMR. (Note: CAM-ICU Copyright© E. Wesley Ely, MD, MPH and Vanderbilt University, all rights reserved, use is unrestricted and does not require written permission).
Baseline characteristics of eligible patients were described as frequency count and percentage for categorical variables, and as mean and standard deviation (SD), and as median and interquartile range (IQR) when informative, for continuous variables. Patients were stratified into two groups according to highest postoperative 4AT score: 0–3 and ≥ 4, during PACU stay and postoperative hospitalization. Associations between baseline characteristics and 4AT categories were evaluated with a chi-square test or Fisher’s exact test as appropriate for categorical variables, and analysis of variance F-test or Kruskal-Wallis test as appropriate for continuous variables. We used logistic regression analyses based on penalized likelihood to evaluate the association of EFS score ≥ 6 with the outcome of 4AT ≥ 4 during hospital stay while adjusting for potential confounders. POD confounders used in logistic regression were taken from the literature and included sex, race, age, American Society of Anesthesiologists (ASA) physical status score, and Elixhauser comorbidity scores for 30-day readmission and in-hospital mortality. Levels of surgical stress were inferred from perioperative blood transfusion totals and total anesthesia time and were included as potential confounders in the regression models. Total anesthesia time was calculated as the sum of anesthesia times for all operating room procedures during the index hospitalization. Elixhauser comorbidity 30-day readmission and in-hospital mortality scores were calculated according to the Agency for Healthcare Research and Quality (AHRQ) Healthcare Cost and Utilization Project (HCUP) instructions (hcup-us.ahrq.gov). We also evaluated the association of 4AT ≥ 4 during hospital stay with length of hospital stay, hospital discharge disposition, and 30-day mortality after hospital discharge. We carried out additional multivariable logistic modeling to determine relationships between EFS domains and POD.
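The two cutoffs described here (EFS ≥ 6 for vulnerability, highest postoperative 4AT ≥ 4 for a positive POD screen) can be sketched as follows. The function names are illustrative, and treating a positive CAM-ICU assessment as a positive screen is an assumption: the source says only that CAM-ICU scores were included in the analysis, not how they were combined with the 4AT.

```python
from typing import Iterable

EFS_VULNERABLE_CUTOFF = 6   # EFS score >= 6 is defined as vulnerable
POD_4AT_CUTOFF = 4          # 4AT score >= 4 is a positive screen for POD


def is_vulnerable(efs_score: int) -> bool:
    """Classify a preoperative EFS score as vulnerable or not."""
    return efs_score >= EFS_VULNERABLE_CUTOFF


def pod_screen_positive(four_at_scores: Iterable[int],
                        cam_icu_positive: bool = False) -> bool:
    """Classify a hospital stay by its highest postoperative 4AT score.

    cam_icu_positive is a hypothetical flag for patients later admitted
    to the ICU; how it enters the outcome is an assumption.
    """
    scores = list(four_at_scores)
    if not scores:
        raise ValueError("at least one 4AT score is required")
    return max(scores) >= POD_4AT_CUTOFF or cam_icu_positive
```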
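To illustrate the penalized likelihood approach, the sketch below implements Firth-type bias-reduced logistic regression in plain Python. It is not the authors' code or software, and the helper names are illustrative. Each Newton step uses the modified score X^T(y − p + h(0.5 − p)), where h holds the diagonal of the weighted hat matrix. The example design is the crude 2×2 table reported in the Results (10 of 83 vulnerable and 5 of 241 non-vulnerable patients with 4AT ≥ 4), so the estimate is unadjusted and is not expected to equal the adjusted OR from the multivariable model.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def firth_logistic(X, y, max_iter=100, tol=1e-10):
    """Return coefficients maximizing the Firth-penalized likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))
             for row in X]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information: X^T W X
        info = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n))
                 for b in range(k)] for a in range(k)]
        # Hat-matrix diagonal: h_i = w_i * x_i^T (X^T W X)^{-1} x_i
        h = [w[i] * sum(x * v for x, v in zip(X[i], solve(info, X[i])))
             for i in range(n)]
        # Firth-modified score vector
        score = [sum((y[i] - p[i] + h[i] * (0.5 - p[i])) * X[i][a]
                     for i in range(n)) for a in range(k)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Crude example: intercept plus one binary predictor (EFS >= 6).
X = [[1.0, 1.0]] * 83 + [[1.0, 0.0]] * 241
y = [1] * 10 + [0] * 73 + [1] * 5 + [0] * 236
beta = firth_logistic(X, y)
print("crude penalized OR:", math.exp(beta[1]))
```

For a single binary predictor the penalized estimate coincides with adding 0.5 to each cell of the 2×2 table, which gives a quick sanity check on the implementation.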
Based on our practice volume and past data, it seemed feasible to enroll 320 eligible patients. Targeting this sample size, and assuming that 25% (n = 80) would have an EFS score ≥ 6, we calculated that this study would have 85% power to detect a between-group difference of 25% in-hospital incidence of 4AT ≥4 in the EFS ≥6 group versus 10% in-hospital incidence of 4AT ≥4 in the EFS < 6 group (i.e., OR = 3) using a 2-sided z-test with type I error of 0.05.
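The power calculation can be approximated with a normal-theory formula for comparing two proportions. This is a sketch under assumptions: the source does not state which variant of the z-test (pooled or unpooled variance, with or without continuity correction) was used, so the result may differ by a few percentage points from the reported 85%. The function names are illustrative.

```python
import math
from statistics import NormalDist


def power_two_proportions(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a 2-sided z-test for two independent proportions.

    Uses the pooled standard error under the null and the unpooled
    standard error under the alternative; the opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


def odds_ratio(p1, p2):
    """Odds ratio implied by two event probabilities."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


# Planned design: 80 patients with EFS >= 6 (25% POD), 240 with EFS < 6 (10% POD).
power = power_two_proportions(0.25, 80, 0.10, 240)
implied_or = odds_ratio(0.25, 0.10)
print(f"approximate power: {power:.2f}, implied OR: {implied_or:.1f}")
```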
During the time period analyzed, 393 patients ≥ 65 years underwent elective procedures that required at least 1 overnight non-ICU stay and were screened preoperatively with the EFS. Of these 393 subjects, 324 had complete EMR datasets (Fig. 1). Our sample included 83 (25.6%) vulnerable patients (EFS ≥ 6), of whom 10 (12.1%) had postoperative 4AT scores ≥ 4. Of the 241 (74.4%) patients without vulnerability, 5 (2.1%) had postoperative 4AT scores ≥ 4 (p = 0.0007, Fisher’s exact).
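The comparison above is a 2×2 table, and a two-sided Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below uses the common convention of summing the probabilities of all tables no more likely than the observed one; whether the authors' software used the same two-sided convention is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: vulnerable (EFS >= 6), not vulnerable; columns: 4AT >= 4, 4AT 0-3.
p_value = fisher_exact_two_sided(10, 73, 5, 236)
print(f"Fisher's exact p = {p_value:.4f}")
```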
In univariate analysis (Table 1), a higher incidence of 4AT scores ≥4 was associated with increased age and greater comorbidity, as reflected in both the higher ASA score and greater Elixhauser 30-day readmission score.
In a series of bivariable analyses for 4AT score ≥ 4, each of which incorporated EFS ≥ 6 and one additional significant univariate predictor, EFS was consistently associated with 4AT score ≥ 4 (Table 2). Age was not significant when adjusted for EFS. However, both measures of comorbidity (ASA and Elixhauser) maintained significance after EFS adjustment. Only EFS and Elixhauser maintained significance in multivariable modeling.
As shown in Table 1, length of stay (LOS) was greater among patients with 4AT score ≥ 4. Among the 250 patients who had LOS < 5 days, 62 were frail, and 5 of those (8.1%) screened positive for POD during their hospital stay. In contrast, 4 of 188 (2.1%) who were not frail before the surgery screened positive for POD (OR = 3.92; 95% CI: 1.08 to 14.21; P = 0.037). LOS did not modify the association between POD and frailty. Patients with 4AT score ≥ 4 were more likely to have requirements for skilled nursing or rehabilitation facilities on hospital discharge.
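As a worked check on the LOS < 5 days subgroup, the sketch below computes an odds ratio with 0.5 added to each cell (the Haldane-Anscombe correction), which for a single 2×2 table coincides with the Firth-penalized estimate. With the counts given above (5 of 62 frail and 4 of 188 non-frail patients screening positive) it gives a point estimate of about 3.92, matching the reported value. The Wald interval shown is an approximation and does not exactly reproduce the reported interval, which presumably came from the penalized likelihood itself; that is an assumption, as is the function name.

```python
import math


def corrected_odds_ratio(a, b, c, d, z=1.959964):
    """Odds ratio with 0.5 added to each cell, plus a Wald 95% interval.

    a, b: events and non-events in the exposed group
    c, d: events and non-events in the unexposed group
    """
    a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# LOS < 5 days: 5 of 62 frail vs 4 of 188 non-frail screened positive for POD.
or_est, ci_low, ci_high = corrected_odds_ratio(5, 57, 4, 184)
print(f"OR = {or_est:.2f} (Wald 95% CI {ci_low:.2f} to {ci_high:.2f})")
```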
Four of the 324 patients died within 30 days postoperatively. Thirty-day mortality was associated with 4AT score ≥ 4 as well as increased Elixhauser mortality score.
Analysis of specific EFS domains
We conducted multivariable analysis of each EFS domain and adjusted for age, ASA, and Elixhauser comorbidity score (Table 3). The data showed an overall trend in which higher scores per domain had a higher odds ratio for postoperative 4AT scores ≥4. Strong associations occurred with presence of incontinence, timed get up and go, decreased functional independence, previous hospital admissions, and forgetting to take medications. Of interest, difficulty with clock draw did not have as strong an odds ratio estimate, although the 95% confidence intervals for these association estimates were generally wide.
This study shows that EFS-determined vulnerability is a predictor of postoperative in-hospital 4AT scores ≥4 in older non-ICU patients undergoing elective surgery. Among the EFS domains, the strongest associations with in-hospital 4AT scores ≥4 were requirements for assistance with activities of daily living, presence of incontinence, difficulty with timed get up and go, and forgetting to take medications. EFS criteria for vulnerability are predictors of POD in older surgical patients undergoing lower-risk procedures.
In general, most frailty instruments demonstrate associations between frailty and poorer outcomes. For instance, frailty instruments add predictive value for death, new disability, and LOS after major elective surgery. However, surgical outcome studies vary considerably in both type of frailty instrument used and the frailty incidence detected. Unfortunately, there are few head-to-head comparisons of frailty instruments in terms of their ability to predict POD. Data from a meta-analysis that compared EFS and Fried criteria suggested stronger associations for POD with the Fried criteria. Among studies providing area under the curve (AUC) data, one reported an AUC of 0.695 using the modified Fried criteria, whereas a study using the Groningen criteria reported an AUC of 0.89. Our bivariate analysis adjusted for Elixhauser comorbidity score and EFS had an AUC of 0.796. Our multivariate analysis adjusted for age, ASA category, Elixhauser comorbidity score, and EFS had an AUC of 0.833. Both of our models that incorporated EFS demonstrated excellent predictive capability for delirium using the EFS ≥ 6 cutoff.
Our study population varied considerably from those in earlier reports. Patients requiring ICU admission or urgent/emergent surgery were excluded. In addition, we included a broad range of surgical specialties. Emergency/urgent surgery, ICU admission, and procedures with high cardiac risk are all strong risk factors for delirium. Their elimination from the study population accounts for the lower delirium case index. Our power calculations assumed a higher incidence rate than we observed. However, the observed odds ratio was higher, which is consistent with our findings that EFS-determined vulnerability is associated with 4AT ≥ 4. The lower case index and the associated loss of power are likely important. However, the direction of the frailty effect was as expected. Longer length of hospital stay was associated with 4AT ≥ 4 and higher Elixhauser comorbidity score, but it did not affect the association between POD and frailty. In any case, the strong association that we found emphasizes that use of EFS criteria still gives strong predictive value for POD in older surgical patients undergoing lower-risk procedures.
We sought to examine the frailty–delirium relationship more closely by determining which EFS items are most closely associated with delirium. Although the questions on the EFS are not granular, most EFS domains showed a trend toward predicting delirium. This trend probably accounts for the strong relationship between EFS-detected vulnerability and delirium. The EFS domains that focused on function and mobility had strong associations with delirium and are consistent with those in the literature. Urinary incontinence and delirium might be linked via need for anticholinergic administration or underlying cognitive dysfunction. However, chi-square analysis of anticholinergics with and without incontinence did not show statistical significance. Incontinence is also associated with presence of underlying neurologic disease. However, chi-square analysis of neurologic comorbidities from the Elixhauser score (paralysis, dementia, psychosis) with and without incontinence showed no significance. Thus, the association of the incontinence EFS domain with delirium may occur via some other mechanism.
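For readers less familiar with the AUC values compared here, the sketch below computes the AUC as the probability that a randomly chosen patient with delirium receives a higher predicted risk than a randomly chosen patient without it, counting ties as one half. The function name and the example risks are purely illustrative and are not study data.

```python
def auc(risks_with_event, risks_without_event):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    wins = 0.0
    for r_pos in risks_with_event:
        for r_neg in risks_without_event:
            if r_pos > r_neg:
                wins += 1.0
            elif r_pos == r_neg:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(risks_with_event) * len(risks_without_event))


# Hypothetical predicted risks, for illustration only.
example = auc([0.8, 0.4], [0.6, 0.2])
print(example)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.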
Additional clinical importance of our study comes from our 30-day mortality predictor analysis (Supplemental table). In multivariate analysis, only postoperative 4AT scores ≥4 were predictive of 30-day mortality. The fact that only four deaths occurred within 30 days of surgery limits the generalizability of this result and accounts for the wide confidence intervals. Nonetheless, future studies on preventing delirium with interventions focused on delirium risk factors, such as frailty, may be important for decreasing 30-day mortality even in older populations undergoing low-risk surgery.
Strengths and limitations
Here we studied pragmatic frailty and delirium assessments. Both the EFS and 4AT are easily administered and feasible in preoperative surgical clinics and on surgical wards. Full EMR datasets were analyzed, providing opportunities for implementation of EMR-driven quality improvement dashboards. All types of elective non-ICU–requiring surgeries were included, giving a broader base to our understanding of frailty–delirium relationships. Nevertheless, some limitations must be considered. The study was retrospective in an in-hospital setting. Total anesthesia time is an imperfect indicator of surgical stress, as some surgeries may take disproportionately longer but are not necessarily more invasive. POD was assessed in a clinical setting rather than with the gold-standard methods used in research, and we implicitly assumed that patients did not develop surgery-related POD after hospital discharge. Mild cognitive dysfunction may have been underrecognized, as the clock draw is limited in its recognition of very mild dementia.
Not all wards accepting postoperative patients performed the 4AT, and not all surgical clinics performed the EFS assessment preoperatively, which limited our sample size and generalizability. The targeted sample size in the power evaluation was based on feasibility determined by past patient volume, but the POD event rate in the evaluation was assumed according to general estimates and was too high in light of the lower surgical risk in the population being studied. The limited sample size resulted in a small number of cases (patients with 4AT score ≥ 4). We were careful in our analyses not to overfit the regression model for a small sample, for example, by conducting the adjusted analysis sequentially, first adjusting for only one relevant covariate at a time (models 1 to 3 in Table 2). To address small-sample bias in logistic regression analysis, we used a penalized likelihood approach for our analyses. The limited sample size and low number of events did not allow for extensive multivariable modeling; therefore, residual confounding cannot be ruled out. The small sample size also resulted in less precise association estimates, as is apparent in the wide 95% confidence intervals for the odds ratio estimates. However, the results are in keeping with previous meta-analysis results. It is also worth noting that the OR estimate of 3.5 associated with EFS ≥ 6 from the multivariable model reported in Table 2 may represent a weighted average of association levels in the range of higher EFS scores observed in our study. In an exploratory analysis using a two-linear-segment logistic regression model, the OR estimate was 1.0 (0.6 to 1.5) per 1-point increase in EFS score between scores of 0 and 5, and 1.8 (1.2 to 2.9) per 1-point increase in EFS score between 6 and 9, the upper limit of EFS score observed in our sample (data not shown).
Larger datasets from additional research will be needed to analyze the association with EFS treated as the ordinal variable that it is, and to further examine the association with POD beyond the EFS range observed in our data.
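The two-linear-segment model in the exploratory analysis can be read as a piecewise-linear log-odds function of the EFS score. The sketch below shows how the reported per-point ORs combine into odds relative to an EFS score of 0. The knot location at a score of 5 is an assumption (the source gives only the score ranges for the two segments), and the function name is illustrative; confidence intervals are ignored.

```python
import math


def relative_odds(efs_score, knot=5, or_below=1.0, or_above=1.8):
    """Odds of POD relative to EFS = 0 under a two-segment log-linear model.

    knot is the assumed score at which the slope changes; or_below and
    or_above are the reported per-point odds ratios for the two segments.
    """
    points_below = min(efs_score, knot)
    points_above = max(efs_score - knot, 0)
    log_odds = (points_below * math.log(or_below)
                + points_above * math.log(or_above))
    return math.exp(log_odds)


for score in (3, 5, 6, 9):
    print(score, round(relative_odds(score), 2))
```

Under these assumptions the odds are flat up to the knot and then multiply by 1.8 for each additional point, which is one way to see why a single dichotomized OR averages over a range of risk levels.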
This study shows that vulnerability, as determined by an EFS score ≥ 6, is a strong predictor of POD in older elective surgical patients who do not require ICU admission. When used as a screening instrument for frailty, the EFS can help the provider detect subtle differences and refer these patients for further diagnostic workup. Thus, the EFS could be considered as an important preoperative assessment tool for determining POD risk in lower-risk surgical populations.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
4AT: 4 A’s test
ASA: American Society of Anesthesiologists
AUC: Area under the curve
EFS: Edmonton Frailty Scale
EMR: Electronic medical record
ICU: Intensive care unit
PACU: Post-anesthesia care unit
Walston J, Buta B, Xue QL. Frailty screening and interventions: considerations for clinical practice. Clin Geriatr Med. 2018;34(1):25–38.
Gracie TJ, Caufield-Noll C, Wang NY, Sieber FE. The association of preoperative frailty and postoperative delirium: a meta-analysis. Anesth Analg. 2021;133(2):314–23.
Persico I, Cesari M, Morandi A, Haas J, Mazzola P, Zambon A, et al. Frailty and delirium in older adults: a systematic review and Meta-analysis of the literature. J Am Geriatr Soc. 2018;66(10):2022–30.
Aucoin SD, Hao M, Sohi R, Shaw J, Bentov I, Walker D, et al. Accuracy and feasibility of clinically applied frailty instruments before surgery: a systematic review and meta-analysis. Anesthesiology. 2020;133(1):78–95.
Rolfson DB, Majumdar SR, Tsuyuki RT, Tahir A, Rockwood K. Validity and reliability of the Edmonton frail scale. Age Ageing. 2006;35(5):526–9.
Mohanty S, Rosenthal RA, Russell MM, Neuman MD, Ko CY, Esnaola NF. Optimal perioperative management of the geriatric patient: a best practices guideline from the American College of Surgeons NSQIP and the American Geriatrics Society. J Am Coll Surg. 2016;222(5):930–47.
Bellelli G, Morandi A, Davis DH, Mazzola P, Turco R, Gentile S, et al. Validation of the 4AT, a new instrument for rapid delirium screening: a study in 234 hospitalised older people. Age Ageing. 2014;43(4):496–502.
Saller T, MacLullich AMJ, Schäfer ST, Crispin A, Neitzert R, Schüle C, et al. Screening for delirium after surgery: validation of the 4 A's test (4AT) in the post-anaesthesia care unit. Anaesthesia. 2019;74(10):1260–6.
Ely EW, Inouye SK, Bernard GR, Gordon S, Francis J, May L, et al. Delirium in mechanically ventilated patients: validity and reliability of the confusion assessment method for the intensive care unit (CAM-ICU). JAMA. 2001;286(21):2703–10.
Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80(1):27–38.
Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8–27.
McIsaac DI, MacDonald DB, Aucoin SD. Frailty for perioperative clinicians: a narrative review. Anesth Analg. 2020;130(6):1450–60.
McIsaac DI, Harris EP, Hladkowicz E, Moloo H, Lalu MM, Bryson GL, et al. Prospective comparison of preoperative predictive performance between 3 leading frailty instruments. Anesth Analg. 2020;131(1):263–72.
Jung P, Pereira MA, Hiebert B, Song X, Rockwood K, Tangri N, et al. The impact of frailty on postoperative delirium in cardiac surgery patients. J Thorac Cardiovasc Surg. 2015;149(3):869–75.
Pol RA, van Leeuwen BL, Visser L, Izaks GJ, van den Dungen JJ, Tielliu IF, et al. Standardised frailty indicator as predictor for postoperative delirium after vascular surgery: a prospective cohort study. Eur J Vasc Endovasc Surg. 2011;42(6):824–30.
Inouye SK. Delirium in older persons. N Engl J Med. 2006;354(11):1157–65.
Powlishta KK, Von Dras DD, Stanford A, Carr DB, Tsering C, Miller JP, et al. The clock drawing test is a poor screen for very mild dementia. Neurology. 2002;59(6):898–903.
The authors thank Claire Levine, MS, ELS, scientific editor for the Johns Hopkins Department of Anesthesiology and Critical Care Medicine for her careful preparation and editing of the manuscript.
Funding was obtained through a departmental internal grant mechanism from the Johns Hopkins Department of Anesthesiology and Critical Care Medicine. Funding agency had no role in the study design, collection, analysis, and interpretation of data and in writing the manuscript.
Ethics approval and consent to participate
IRB approval was obtained through Johns Hopkins IRB Committee (IRB-X) protocol IRB00238310, which included a waiver of consent requirements. The data download and security protocol was reviewed and approved by the Johns Hopkins Institutional Data Trust. The data used in this study was anonymised before its use.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Sieber, F., Gearhart, S., Bettick, D. et al. Edmonton frailty scale score predicts postoperative delirium: a retrospective cohort analysis. BMC Geriatr 22, 585 (2022). https://doi.org/10.1186/s12877-022-03252-8