Comparative utility of frailty to a general prognostic score in identifying patients at risk for poor outcomes after aortic valve replacement

Background Current guidelines recommend considering life expectancy before aortic valve replacement (AVR). We compared the performance of a general mortality index, the Lee index, to a frailty index.
Methods We conducted a prospective cohort study of 246 older adults undergoing surgical (SAVR) or transcatheter aortic valve replacement (TAVR) at a single academic medical center. We compared performance of the Lee index to a deficit accumulation frailty index (FI). Logistic regression was used to assess the association of Lee index or FI with poor outcome, defined as death or functional decline with severe symptoms at 12 months. Discrimination was assessed using C-statistics.
Results In the overall cohort, 44 experienced poor outcome (31 deaths, 13 functional decline with severe symptoms). The risk of poor outcome by Lee index quartiles was 6.8% (reference), 17.9% (odds ratio [OR], 3.0; 95% confidence interval [0.9–10.2]), 20.0% (OR 3.4; [1.0–11.4]), and 34.0% (OR 7.1; [2.2–22.6]) (p-for-trend = 0.001). Risk of poor outcome by FI quartiles was 3.6% (reference), 10.3% (OR 3.1; [0.6–15.8]), 25.0% (OR 8.8; [1.9–41.0]), and 37.3% (OR 15.8; [3.5–71.1]) (p-for-trend < 0.001). The Lee index predicted the risk of poor outcome in the SAVR cohort (quartiles 1–4: 2.1, 4.0, 15.4, and 20.0%; p-for-trend = 0.04), but not in the TAVR cohort (quartiles 1–4: 27.3, 29.0, 21.3, 35.4%; p-for-trend = 0.42). In contrast, the FI did not predict the risk of poor outcome well in the SAVR cohort (quartiles 1–4: 2.3, 4.4, 15.8, and 0%; p-for-trend = 0.24), but did in the TAVR cohort (quartiles 1–4: 9.1, 14.3, 29.7, and 40.7%; p-for-trend = 0.004). Compared to the Lee index, the FI demonstrated higher C-statistics in the overall (Lee index versus FI: 0.680 versus 0.735; p = 0.03) and TAVR (0.560 versus 0.644; p = 0.03) cohorts, but not in the SAVR cohort (0.724 versus 0.766; p = 0.09).
Conclusions While a general mortality index, the Lee index, predicted death or functional decline with severe symptoms at 12 months well among SAVR patients, the FI derived from a multi-domain geriatric assessment better informs risk stratification for high-risk TAVR patients.


Introduction
Aortic stenosis disproportionately affects older adults, and its incidence is expected to increase as the population ages [1]. Historically, the standard of care for this population has been surgical aortic valve replacement (SAVR). Transcatheter aortic valve replacement (TAVR) is now an option for patients with severe aortic stenosis who were previously not surgical candidates and thus had no interventional options. More recently, the approval of TAVR for low-risk patients has increased procedural volumes among healthier patients [2,3]. Although the risk profile of the average TAVR candidate is changing, there remain considerable challenges in determining procedural candidacy among the complex and multimorbid patients to whom this intervention was first offered [1]. The anticipated increase in procedural volumes prompts novel considerations in defining procedural candidacy and person-centered outcomes for high-risk individuals.
The American College of Cardiology (ACC) guidelines emphasize the role of primary care providers in recognizing, investigating, and appropriately referring patients for management of valvular heart disease [4]. In doing so, consideration of life expectancy is recommended as part of the evaluation for TAVR, to help determine futility [4]. Prognostic indices for mortality prediction have been developed and applied in the general older adult population [5,6]. However, the development cohorts differ from the population of TAVR candidates with respect to age, comorbidities, and functional status. For example, the Lee index, a well-validated and widely adopted 4- to 10-year prognostic index for mortality, was validated among community-dwelling individuals with a median age of less than 70 years [6,7]. Additionally, prognostic indices incorporate demographic factors such as age and sex, and these are typically heavily weighted, which may limit discriminative ability in the oldest-old populations. Lastly, prognostic indices to estimate mortality generally do not account for frailty, a state of diminished physiologic reserve known to confer heightened vulnerability to adverse events in the setting of cardiac surgery [8,9,10]. In fact, current literature for TAVR evaluation supports risk stratification by integrating markers of frailty, including gait speed and chair stands [10,11], or comprehensive geriatric assessment [11]. Nonetheless, adoption of frailty measurements remains low in this setting; the ACC-TAVR risk score does not consider any frailty markers [4].
Finally, current cardiac risk stratification estimates 30-day mortality and major adverse cardiac events. However, frail and multimorbid patients often value functional independence more than longevity [12]. Specifically, work in patients with heart failure has suggested a preference for preservation of quality of life [13], and TAVR patients have described preserving independence as a primary driving factor in their decisions [14,15]. An evolution towards predicting functional outcomes may facilitate better-informed decisions among older and higher-risk candidates for AVR [10,11,15]. Thus, how best to estimate prognosis in this population to inform treatment decisions remains uncertain. In this paper, we evaluate the utility of a general prognostic instrument, the Lee index, in predicting functional decline or death following AVR [6]. We further compare its performance characteristics to those of a comprehensive geriatric assessment-based frailty index (FI).

Study population
We conducted a prospective cohort study of older adults undergoing AVR at the Beth Israel Deaconess Medical Center, Boston, MA, USA. Study design and protocols have been previously published [9]. We enrolled patients aged 70 years or older who were undergoing SAVR or TAVR for severe aortic stenosis. Patients were excluded for 1) emergent surgery or surgery involving the aorta or another heart valve; 2) clinical instability (such as hemodynamic instability, acute decompensated heart failure, or active myocardial ischemia); 3) Mini-Mental State Examination (MMSE) score < 15 points or active psychosis; or 4) not speaking English. In total, between 2014 and 2016, we screened 446 patients and enrolled 246. This analysis included the 91 SAVR and 137 TAVR patients with available functional status data at 12 months. None of the research data collected affected procedural decisions. This study was approved by the Institutional Review Board, and written consent was obtained.

Study measurements
A trained research assistant or research nurse interviewed patients to obtain New York Heart Association (NYHA) classification, activities of daily living (ADLs), instrumental activities of daily living (IADLs), 5 tasks in the Nagi scale, and 3 tasks in the Rosow-Breslau scale (Additional file 1: Table S1). We also measured MMSE, 5-item Geriatric Depression Scale, gait speed (m/sec) (calculated from 3 trials of 5-m walk at usual pace), and average grip strength (kg) (3 measurements using a Jamar hydraulic dynamometer in the dominant hand). A study-affiliated geriatrician reviewed medical records to extract body mass index, comorbidities, medications, and laboratory values. The Society of Thoracic Surgeons Predicted Risk of Mortality (STS-PROM) and Charlson comorbidity index were calculated.
We calculated a Lee index and FI score for each participant at the time of pre-operative assessment. The Lee index (range 0-26) is based on 12 items: age, sex, body mass index (BMI) < 25 kg/m², lung disease, cancer, diabetes, congestive heart failure, current smoking, difficulty bathing, difficulty with finances, difficulty pushing or pulling large objects, and difficulty walking several blocks [6]. Each item that is present contributes a fixed number of points (up to 7 for age, 1 or 2 points for the others). Higher total points indicate a higher risk of mortality and thus a worse prognosis. The FI (range 0-1) was based on the deficit accumulation model of frailty. It was calculated as the proportion of deficits present among 48 items spanning 5 domains: medical comorbidities, functional limitations (ADLs and IADLs), physical performance measures (gait speed, grip strength, chair stands), cognition, and nutrition (Additional file 1: Table S1) [16]. For example, an individual with 12 deficits would be assigned an FI score of 0.25 (= 12/48). Greater scores indicate more advanced frailty [17].
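The deficit-proportion calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's scoring code: the item names are hypothetical placeholders (the actual 48 items are listed in Additional file 1: Table S1, which is not reproduced here), and allowing graded scores between 0 and 1 is an assumption rather than something stated in the text.

```python
def frailty_index(deficits):
    """Return the proportion of deficits present among all items assessed.

    `deficits` maps an item name to a score from 0 (absent) to 1 (present).
    Graded scores between 0 and 1 are an assumption made for this sketch.
    """
    if not deficits:
        raise ValueError("at least one item is required")
    for name, score in deficits.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")
    return sum(deficits.values()) / len(deficits)


# Hypothetical participant: 48 placeholder items, 12 of them scored as present.
items = {f"item_{i:02d}": (1.0 if i < 12 else 0.0) for i in range(48)}
fi = frailty_index(items)
print(f"FI = {fi:.2f}")  # 12 / 48 = 0.25
```

The sketch divides by the number of items supplied, so how unassessed items are handled (dropped from the denominator or imputed) would need to follow the study protocol, which the text does not describe.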

Outcomes
Trained research assistants conducted follow-up telephone interviews. Information was obtained via mail-in questionnaire if we were unable to reach participants by phone. We ascertained vital status, NYHA class, and limitations in 22 daily activities and physical tasks. Poor outcome, our combined endpoint of interest, was defined as death, or NYHA Class III or IV (indicating symptoms at minimal activity) with functional decline at 12 months.

Statistical analysis
As TAVR patients were clinically different from SAVR patients, the cohorts were analyzed separately. However, because it is often unclear during pre-operative testing which procedure a patient will ultimately undergo, the overall cohort was also examined to provide information that may be useful for preliminary evaluation. Baseline preoperative characteristics were compared between the SAVR and TAVR cohorts using the t-test or chi-square test. We created risk quartiles of the Lee index and FI based on score distributions in the combined cohort. We then calculated the percentage of patients within each risk quartile who experienced the poor outcome at 12 months and compared the proportions using a trend test. Logistic regression was used to estimate the odds ratio (OR) and 95% confidence interval (CI) of poor outcome at 12 months for both Lee index and FI quartiles in each cohort, with and without adjustment for age and sex. As a sensitivity analysis, we also performed logistic regression for continuous Lee index and FI scores after standardization. We assessed discrimination for each index as a continuous variable in the combined, SAVR, and TAVR cohorts using C-statistics. Differences in C-statistics between the two models were compared using 1000 bootstrap resamples. Analyses were performed in Stata release 14 (StataCorp, College Station, TX). A 2-sided p-value < 0.05 was considered statistically significant.
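As a rough illustration of the quartile-based analysis described above, the following Python sketch assigns quartiles from pooled scores and computes an unadjusted odds ratio with a Wald 95% CI for one quartile against the reference. It is not the study's Stata code. The function names and counts are hypothetical, the quartile cut points use Python's default interpolation (which may differ from Stata's), and the age- and sex-adjusted models are not covered. For a single categorical predictor, the unadjusted logistic regression OR equals the cross-product ratio computed here.

```python
import math
from bisect import bisect_left
from statistics import quantiles


def assign_quartiles(scores):
    """Label each score 1-4 using cut points taken from the pooled scores."""
    cuts = quantiles(scores, n=4)  # three cut points
    return [bisect_left(cuts, s) + 1 for s in scores]


def odds_ratio_wald(events_ref, n_ref, events_cmp, n_cmp, z=1.96):
    """Unadjusted OR (comparison vs. reference quartile) with a Wald CI."""
    a, b = events_cmp, n_cmp - events_cmp  # comparison: events, non-events
    c, d = events_ref, n_ref - events_ref  # reference: events, non-events
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes the Wald interval undefined")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts, chosen only for illustration (not study data):
# 5 of 50 poor outcomes in the reference quartile, 15 of 50 in the comparison.
or_, lower, upper = odds_ratio_wald(events_ref=5, n_ref=50, events_cmp=15, n_cmp=50)
print(f"OR {or_:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

A wide interval around a large OR, as seen for the upper quartiles in this study, typically reflects a small number of events in the reference quartile.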

Prediction of poor outcomes with FI
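The risk of poor outcome in the combined cohort was 3.6% in quartile 1 (reference), 10.3% in quartile 2 (OR 3.1; [0.6–15.8]), 25.0% in quartile 3 (OR 8.8; [1.9–41.0]), and 37.3% in quartile 4 (OR 15.8; [3.5–71.1]) (p-for-trend < 0.001).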

Comparison of model discrimination
In the combined cohort the Lee index model demonstrated improved discriminatory power over the reference models (C-statistic 0.680, Fig. 1a), but not in the SAVR (C-statistic 0.766) or TAVR (C-statistic 0.560) cohorts (Fig. 1b). The FI model demonstrated improved discriminatory power within the combined (C-statistic 0.735) and TAVR (C-statistic 0.644) cohorts, but not SAVR (C-statistic 0.724).
The FI C-statistic was significantly higher than that of the Lee index in both the combined (p = 0.03) and TAVR (p = 0.03) cohorts after adjusting for age and sex (Fig. 1). However, there was no statistically significant difference in C-statistics between the Lee index and FI in the SAVR cohort (p = 0.09).
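To make the comparison of C-statistics concrete, the sketch below computes a C-statistic by pairwise concordance and a bootstrap percentile interval for the difference between two scores measured in the same patients. It illustrates the general technique only. The text does not specify how the p-values were derived from the 1000 bootstrap resamples, so the percentile interval, the function names, and the toy data are all assumptions of this sketch.

```python
import random


def c_statistic(scores, outcomes):
    """Probability that a case scores higher than a non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = sum(
        1.0 if c > k else 0.5 if c == k else 0.0 for c in cases for k in controls
    )
    return concordant / (len(cases) * len(controls))


def bootstrap_c_difference(score_a, score_b, outcomes, n_boot=1000, seed=0):
    """Percentile 95% interval for C(score_a) - C(score_b), resampling patients."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [outcomes[i] for i in idx]
        if sum(y) in (0, n):
            continue  # resample lacks cases or non-cases; draw again
        c_a = c_statistic([score_a[i] for i in idx], y)
        c_b = c_statistic([score_b[i] for i in idx], y)
        diffs.append(c_a - c_b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Toy data (not study data): 1 = poor outcome, 0 = no poor outcome.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
score_a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
score_b = [0.5, 0.1, 0.2, 0.6, 0.4, 0.3, 0.8, 0.7]
print(c_statistic(score_a, outcomes))  # 0.75
low, high = bootstrap_c_difference(score_a, score_b, outcomes)
print(f"95% interval for the difference: {low:.2f} to {high:.2f}")
```

Resampling patients, rather than resampling each score separately, keeps the two scores paired within each resample, which is what a within-cohort comparison of two indices requires.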

Discussion
In this study of 228 older adults undergoing AVR, we evaluated the performance of a general mortality index in predicting death or functional decline with severe symptoms at 12 months. We observed a skewed distribution towards higher Lee index risk scores and an associated ceiling effect of the Lee index within the TAVR cohort. While the Lee index discriminated well among the healthier SAVR cohort, predictive performance was poor among TAVR patients. In contrast, the FI predicted risk of poor outcomes well in both groups, but its performance was uniquely better among TAVR patients.
Thus, by integrating multi-domain geriatric assessment, the FI better informs risk stratification for TAVR candidates. Although the Lee index has been a favored prognostic index across many clinical and investigational contexts, it may not be an optimal tool to assess risk in an evolving population of complex, multimorbid, and often frail procedural candidates. The apparent ceiling effect of the Lee index in TAVR patients may be due to the unique characteristics of patients with severe aortic stenosis. For example, the mean age of patients within our TAVR cohort (84.4 years) is approximately 17 years older than that of the average person in the Health and Retirement Study (HRS) cohort used by Lee et al. (67 years) [6]. As compared to 3% of individuals in the HRS cohort, 73.2% of our TAVR population carried a heart failure diagnosis [6]. In addition, a considerable proportion of our TAVR population (80%) had at least one IADL limitation, as compared to 12-16% of the HRS cohort [6]. The demonstrated ceiling effect of the Lee index within our cohort supports the need for prognostic indices that discriminate within multimorbid and/or frail populations. The poor performance within the TAVR cohort additionally suggests a need for prognostic models capable of finer discrimination when applied to older populations with a narrower age distribution. Lee et al. reported that age explained the majority of variability in mortality, as predicted by their model [6]. In addition, the development of the Lee index within an exclusively community-dwelling population may limit its generalizability to long-term care residents and community-dwellers at risk of new institutionalization.
In addition to its poor accuracy and external validity within older and higher-risk TAVR patients, the Lee index was not optimized to predict person-centered outcomes, such as functional status. Prediction of person-centered outcomes may be especially relevant to high-risk TAVR candidates, whose decisions must weigh sizable disease-mediated mortality against previously accumulated health and functional deficits [17,18]. In a single-center analysis of patient-defined goals among TAVR candidates, only 7% of patients cited survival as their primary desired endpoint [14]. By comparison, most patients described a desire to perform a particular activity (48%) or maintain independence (30%) [14]. As such, prognostic indices developed from the general population may also be limited in their capacity to characterize the defined priorities of higher-risk procedural candidates. Dedicated research regarding post-TAVR cognitive and functional outcomes, as well as increased representation of the oldest-old within longitudinal population health surveys, may inform more accurate and patient-centered prognostic indices for TAVR candidates.
There are limitations to this study. First, our study was conducted at a large academic medical center in a predominantly Caucasian population. Therefore, the generalizability of our findings to medical centers with lower procedural volumes or distinct patient demographics merits further consideration. Second, the modest sample size limits our ability to detect a potentially clinically meaningful difference in discrimination within procedure-specific cohorts. Third, our combined endpoint of death or NYHA class III or IV functional status was based on self-report. Nonetheless, self-reported functional status has been well validated against objective endpoints [19]. Lastly, our analysis is predicated upon a composite outcome of symptomatic functional decline and mortality, whereas the Lee index was developed to predict mortality alone. The use of a composite endpoint, however, captures functional outcomes, which are often of paramount importance to older adults.

Conclusions
The peri-procedural morbidity and mortality of TAVR have declined with the recent adoption of TAVR within healthier populations, in addition to improved procedural techniques and device technology [3,20]. However, a sizable cohort of complex and vulnerable older adults will continue to require informed counseling as to their procedural risks and anticipated outcomes. Our analysis demonstrates that prognostic indices developed from the general, community-dwelling population do not appropriately discriminate risk of poor outcomes among older and multimorbid procedural candidates with frailty. Explicit incorporation of frailty may better discriminate procedural risk in high-risk populations than general prognostic instruments such as the Lee index, and may provide useful information for shared decision making.