Detection of potentially inappropriate prescribing in the very old: cross-sectional analysis of the data from the BELFRAIL observational cohort study

Background Little is known about the prevalence and clinical importance of potentially inappropriate prescribing instances (PIPs) in the very old (>80 years). The main objective was to describe the prevalence of PIPs according to START (Screening Tool to Alert doctors to Right Treatment; omissions), STOPP (Screening Tool of Older Person’s Prescriptions; over/misuse) and the Beers list (over/misuse). Secondary objectives were to identify determinants of PIPs and to assess the clinical importance of modifying the treatment in case of PIPs. Methods Cross-sectional analysis of baseline data of the BELFRAIL cohort, which included 567 Belgian patients aged 80 and older in primary care. Two independent researchers applied the screening tools to the study population to detect PIPs. Next, a multidisciplinary panel of experts rated the clinical importance of the PIPs on a subsample of 50 patients. Results In this very old population (median age 84 years, 63 % female), the screening detected START-PIPs in 59 % of patients, STOPP-PIPs in 41 % and Beers-PIPs in 32 %. Assessment of the clinical importance revealed that the most frequent PIPs were of moderate or major importance. In 28 % of the subsample, the relevance of the PIP was challenged by the global medical, functional and social background of the patient; hence the validity of some criteria was questioned. Conclusion Potentially inappropriate prescribing is highly prevalent in the very old. A good understanding of the patients’ medical, functional and social context is crucial to assess the actual appropriateness of drug treatment.


Background
Potentially inappropriate prescribing (PIP) is highly prevalent in older adults and has been associated with adverse drug events, hospitalization and death [1][2][3][4]. A review reported that the median rate of inappropriate prescribing in primary care was around 20 % in patients aged over 65 years [5]. However, little is known about the prevalence of inappropriate prescribing in the very old, even though this group represents a challenge for healthcare because of multiple comorbidities, polypharmacy, frailty features and increased sensitivity to adverse drug events.
Several approaches exist to detect and reduce the burden of inappropriate prescribing in older adults, including the use of criterion-based screening tools, also called explicit tools [2,6]. The most studied explicit tool is the Beers list, which was first published in 1991 [7] and last updated in 2012 [8]. In recent years, another explicit tool, the Screening Tool of Older Person's Prescriptions (STOPP) and Screening Tool to Alert doctors to Right Treatment (START) [9], has been increasingly used [1]. The tool aims at detecting PIP in patients aged over 65 years, which was the target population of most of the published studies using this tool. While Beers and STOPP address over- and misuse of inappropriate medications, the START tool allows for the detection of potentially inappropriate drug omissions. Some overlap of content between the STOPP and the 2012 Beers criteria has been described [10][11][12][13]. In the very old, to the best of our knowledge, no comparison with the updated 2012 Beers list has been performed. A recent study including inpatients aged 85 years and over measured the prevalence of PIPs according to STOPP&START and to a previous version of the Beers criteria (2003) [13]. This study showed a high prevalence of PIPs in that very old population (STOPP: 63 %; START: 54 %; Beers: 47 %) [13]. Moreover, the actual clinical relevance of those tools in the very old is unknown. Indeed, previous studies showed that potentially inappropriate prescribing detected by screening tools might differ from actually inappropriate prescribing [14][15][16].
The primary objective of this study was to determine the prevalence of PIP in community-dwelling patients aged 80 and older (the BELFRAIL population) [17] according to START (START-PIP), STOPP (STOPP-PIP) and Beers (Beers-PIP) tools.
Secondary objectives included the identification of determinants of PIPs in this population, and assessment of the clinical importance of a subsample of PIPs.

Study design, setting and participants
We performed a cross-sectional analysis of the baseline data of the BELFRAIL cohort (BFC80+) [17]. The BELFRAIL study is a prospective, observational, population-based cohort study of Belgian subjects aged 80 years and older [17]. The subjects were recruited in Belgium by their general practitioners (GPs) between November 2, 2008 and September 15, 2009, as described elsewhere [17]. This cohort excluded patients with severe dementia (mini mental state examination [18], MMSE <15/30), patients in palliative care and patients treated as a medical emergency. The protocol of this study was approved by the Biomedical Ethics Committee of the Medical School of the Université catholique de Louvain (UCL) of Brussels, Belgium (B40320084685). All patients gave written informed consent. Because patients with severe dementia (MMSE <15) were excluded, all patients were able to give informed consent themselves.

Data collection

Medical and background data
For the 567 patients of the cohort, the GPs performed a detailed medical history and clinical examination [17]. The GPs reported the complete problem list of their patients as free text. Additionally, a structured questionnaire assessed the presence of 22 chronic conditions. Two researchers (OD and PB) independently coded all the comorbidities listed by the GPs. Discrepancies were discussed with a third researcher (BV) until a consensus was reached. The Cumulative Illness Rating Scale (CIRS) was calculated [19][20][21]. The CIRS counts the number of body systems, out of 14, affected with moderate disability, morbidity or extremely severe disease [19][20][21]; the score therefore varies from 0 to 14. Background data collection also included: cognitive impairment measured by the MMSE (score <25 was considered as "cognitive impairment") [18], geriatric depression scale score GDS-15 (score >4 was considered as "possible depression") [22], Tinetti fall risk score (score >24 was considered as "low fall risk") [23], functional status according to the activities of daily living (ADL) score (range 6 to 30; a lower score indicates functional dependency) [24], incontinence (reported by the GP), body mass index, familial status, and place of residence.
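The cut-offs above can be written down as a small classification step. The following is only a minimal sketch: the class and field names are hypothetical, the high-fall-risk cut-off (Tinetti ≤18) is the one given in the table footnotes of this article, and the label for Tinetti scores between 19 and 24 is an assumption, since the text does not name that band.

```python
from dataclasses import dataclass


@dataclass
class GeriatricScores:
    """Raw scores for one patient (field names are hypothetical)."""
    mmse: int     # Mini Mental State Examination, scored out of 30
    gds15: int    # Geriatric Depression Scale, scored out of 15
    tinetti: int  # Tinetti score, scored out of 28


def classify(scores: GeriatricScores) -> dict:
    """Apply the cut-offs stated in the text to one patient's scores."""
    if scores.tinetti > 24:
        fall_risk = "low"
    elif scores.tinetti <= 18:
        fall_risk = "high"
    else:
        # Assumption: the article does not label the 19-24 band.
        fall_risk = "intermediate"
    return {
        "cognitive_impairment": scores.mmse < 25,
        "possible_depression": scores.gds15 > 4,
        "fall_risk": fall_risk,
    }


# Illustrative values only, not taken from the study data set.
example = classify(GeriatricScores(mmse=18, gds15=3, tinetti=26))
```

Keeping the thresholds in one function makes it easier to see that the binary variables used later in the analysis are derived from, and lose information relative to, the underlying scores.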

Drugs and inappropriate prescribing
GPs were asked to list the drugs the patient was taking on a regular basis or as needed. Drugs were coded and classified according to the Anatomical Therapeutic Chemical (ATC) classification system by one researcher (MA) [25].
Two researchers (OD and AD) independently and retrospectively applied the 87 criteria of the STOPP&START tool [9] and the 3 categories of Beers' criteria (drugs to avoid, drugs to avoid in the presence of certain conditions/diseases, and drugs to use with caution) [8] to the coded medical conditions and drugs. Discrepancies were discussed until consensus. For the analysis of the secondary outcomes, the Beers drugs to use with caution were not considered.
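To make concrete what applying explicit criteria to coded conditions and ATC-coded drugs involves, here is a hedged sketch, not the procedure the researchers used (they applied the criteria manually). Two criteria quoted elsewhere in this article are encoded as deliberately simplified rules. The condition labels, class names and helper functions are hypothetical; the ATC prefixes used are B01AA (vitamin K antagonists, which include warfarin) and M01A (non-steroidal anti-inflammatory and antirheumatic products).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Patient:
    conditions: Set[str] = field(default_factory=set)  # coded problem list
    atc_codes: Set[str] = field(default_factory=set)   # ATC-coded drug list


def takes(patient: Patient, atc_prefix: str) -> bool:
    """True if any listed drug falls under the given ATC prefix."""
    return any(code.startswith(atc_prefix) for code in patient.atc_codes)


@dataclass
class Criterion:
    tool: str
    label: str
    applies: Callable[[Patient], bool]


# Simplified encodings: "long-term" and "mild" are NOT captured here,
# and any vitamin K antagonist is accepted in place of warfarin.
CRITERIA: List[Criterion] = [
    Criterion(
        "START",
        "Warfarin in the presence of chronic atrial fibrillation",
        lambda p: "chronic_atrial_fibrillation" in p.conditions
        and not takes(p, "B01AA"),
    ),
    Criterion(
        "STOPP",
        "Long-term use of NSAID for symptom relief of mild osteoarthritis",
        lambda p: "osteoarthritis" in p.conditions and takes(p, "M01A"),
    ),
]


def screen(patient: Patient) -> List[str]:
    """Return the labels of all criteria triggered by this patient."""
    return [f"{c.tool}-PIP: {c.label}" for c in CRITERIA if c.applies(patient)]
```

A rule of this kind only sees coded data. Anything recorded in free text, such as intolerance to alternatives or "as needed" use, is invisible to it, which is why the flagged items are only potentially inappropriate.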

Clinical importance
On a randomly selected subsample of 50 patients, an expert panel (a general practitioner (JD), a geriatrician (BB) and a clinical pharmacist (AS), all with research experience in the field) was asked to independently rate the actual clinical importance of the recommendations (i.e., to add the drugs suggested by START to the treatment, or to discontinue the drugs detected by STOPP or Beers). Recommendations were classified following a previously defined method as minor, moderate, major, extreme, deleterious or not applicable (description of classifications can be found in Table 4) [14,28]. Consensus on the clinical importance was reached when 2 experts agreed. To assess PIPs in the full context of the patient, the panel had access to the complete record provided by the GP, and not only the coded conditions.
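The two-out-of-three consensus rule can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, and returning `None` when no two experts agree is an assumption, because the text does not describe how such cases were handled.

```python
from collections import Counter
from typing import Optional, Sequence

# Rating categories listed in the text.
RATINGS = ("minor", "moderate", "major", "extreme", "deleterious", "not applicable")


def consensus(ratings: Sequence[str], needed: int = 2) -> Optional[str]:
    """Return the rating shared by at least `needed` experts, else None."""
    for rating in ratings:
        if rating not in RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
    if not ratings:
        return None
    top_rating, votes = Counter(ratings).most_common(1)[0]
    # Assumption: no agreement is reported as None rather than resolved.
    return top_rating if votes >= needed else None
```

With three raters, at most one category can collect two votes, so the result is unambiguous whenever a consensus exists.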

Statistical analysis
Normally distributed continuous variables were expressed as mean ± standard deviation, while not normally distributed ones were summarized using the median and the inter-quartile range [Q25;Q75]. For categorical variables, numbers and percentages were presented. A univariate analysis and a multivariate logistic regression analysis were used to identify determinants of potentially inappropriate prescribing according to each tool. Variables with a p value of <0.05 in the univariate analysis were submitted for multivariate regression analysis. A p value <0.05 was considered statistically significant in the multivariate analysis. Statistical analyses were performed using IBM SPSS Statistics 20 (SPSS Inc., Chicago, IL, USA).
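The two reporting conventions for continuous variables can be illustrated with Python's standard library. This is a sketch only: the function name is hypothetical, whether a variable is treated as normally distributed is passed in as a flag because the text does not say how normality was assessed, and the quartiles use Python's default interpolation method, which may not match the one used by SPSS in every case.

```python
import statistics
from typing import Sequence


def summarize(values: Sequence[float], normal: bool) -> str:
    """Format a continuous variable as 'mean ± SD' or 'median [Q25;Q75]'."""
    if normal:
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation
        return f"{mean:.2f} ± {sd:.2f}"
    # quantiles(n=4) returns the three cut points Q25, median, Q75.
    q25, median, q75 = statistics.quantiles(values, n=4)
    return f"{median:.2f} [{q25:.2f};{q75:.2f}]"


# Illustrative values only, not study data.
toy = [1, 2, 3, 4, 5, 6, 7]
as_normal = summarize(toy, normal=True)
as_skewed = summarize(toy, normal=False)
```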

Results
The characteristics of the 567 patients included at baseline in the cohort are presented in Table 1. Patients had a median age of 84 years, 63 % were female and they lived mainly at home (90 %). The most frequent comorbidities they presented were: hypertension (70 %), osteoarthritis (57 %) and ischemic disease (37 %). Eighty-one percent of the patients had at least one PIP in their medications: 59 % had START-PIPs (drug omissions), 41 % had STOPP-PIPs and 32 % had Beers-PIPs (drug overuse and/or misuse).

Inappropriate prescribing
Overall, we found 1.13 ± 1.34 START-PIP per patient; range 0-8. In the 59 % of patients having at least one START-PIP, the average of START-PIPs rose to 1.90 ± 1.25 per affected patient. Patients had on average 0.58 ± 0.92 STOPP-PIP in their list of prescriptions; range 0-10. The 41 % of affected patients had 1.43 ± 0.95 STOPP-PIP in their treatment.

Abbreviations: ADL activities of daily living, BMI body mass index, CIRS cumulative illness rating scale, COPD chronic obstructive pulmonary disease, GDS geriatric depression scale, GFR glomerular filtration rate, MMSE mini mental state examination, PIPs potentially inappropriate prescribing
(a) ADL ranged between 6 and 30, lower score is related to functional dependency
(b) MMSE <25 was considered as "cognitive impairment"
(c) Tinetti score >24 was considered as "low fall risk"
(d) GDS-15 score >4 was considered as "possible depression"
The application of the Beers tool pointed out Beers-PIPs as drugs to avoid or to avoid in the presence of certain conditions in 32 % of the patients. The mean number of Beers-PIP in the treatment was 0.44 ± 0.79 per patient; range 0-6. In patients having at least one Beers-PIP, the average was 1.38 ± 0.80 per affected patient. In addition, Beers drugs that are labelled to be used with caution were found in 45 % of the patients.
Overall, 108 patients out of the 567 (19 %) had no PIP at all when considering START, STOPP and Beers tools. The most frequent PIPs are presented in Table 2. As far as underuse was concerned, the most frequent drug category using START was cardiovascular (antiplatelets, statins, angiotensin-converting-enzyme inhibitors). The most frequent drug categories related to misuse or overuse were cardiovascular and psychotropic drugs (aspirin, benzodiazepines), and they were similar using STOPP and Beers. The prevalence of PIPs related to benzodiazepine use with a history of falls was less than 1 % (one patient). However, 19 % of the patients on benzodiazepines were at high fall risk according to their Tinetti score, and could therefore be regarded as having PIPs.
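The per-patient means and the means among affected patients reported above are linked by simple arithmetic: patients without a PIP contribute zero, so the overall mean equals the prevalence multiplied by the mean among affected patients. The short check below uses only the figures quoted in this section; the variable and function names are hypothetical, and small differences are expected because the published values are rounded.

```python
# tool: (mean PIPs per patient, prevalence, reported mean per affected patient)
REPORTED = {
    "START": (1.13, 0.59, 1.90),
    "STOPP": (0.58, 0.41, 1.43),
    "Beers": (0.44, 0.32, 1.38),
}


def mean_among_affected(mean_all: float, prevalence: float) -> float:
    """Unaffected patients count as zero, so mean_all = prevalence * mean_affected."""
    return mean_all / prevalence


for tool, (mean_all, prevalence, reported) in REPORTED.items():
    implied = mean_among_affected(mean_all, prevalence)
    print(f"{tool}: implied {implied:.2f}, reported {reported:.2f}")

# Share of patients without any PIP: 108 of 567.
share_without_pip = 108 / 567
print(f"No PIP: {share_without_pip:.0%}")
```

The implied and reported values agree to within rounding, which is a quick way to confirm that the prevalence figures and the means are internally consistent.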

Determinants of PIP
The results of the multivariate analysis are shown in Table 3.

Forty-three PIP cases (30 %) were rated of "major" importance, while 33 PIPs (23 %) were considered of moderate importance. The experts rated two PIPs as "minor". One START-PIP appeared to be "deleterious". Examples are provided in Table 4. Finally, the experts agreed that 40 PIPs (28 %) were "not applicable" according to the individual context and even considered 19 of those cases (13 %) as actually appropriate prescribing.
The 40 "not applicable" cases could be divided into two categories. The first category encompassed 25 cases where the experts found nuances within the detailed full record of the patient, including the comprehensive geriatric assessment, which questioned the applicability of the criteria. In the second category (n = 15), the experts questioned the content validity of the criteria. Examples are provided in Table 5. Comparing the PIPs detected by the three tools, START-PIPs were the most frequently rated as "not applicable".

Summary
For the first time, this study described inappropriate prescribing and its determinants in a large representative sample of community-dwelling very old patients. In this population, the prevalence of PIPs was high. Potentially inappropriate omissions, detected by the START tool, were more prevalent (59 % of the patients) than overuse of treatment (STOPP: 41 %; Beers' drugs "to avoid": 32 %). Drugs from the cardiovascular and neurologic systems were the most frequently involved in PIPs. Among medical, social and geriatric features, the CIRS was related to having START- and Beers-PIPs, place of residence was associated with Beers-PIPs and functional dependence was a determinant of having STOPP-PIPs.
The evaluation of the relevance of the criteria showed that a holistic approach to the patient changed the applicability of the criteria in 17 % of PIP cases. More importantly, in 10 % of cases, the recommendation to modify the drug regimen was not valid, and could even be considered as deleterious, which is not acceptable. Finally, the experts did not rate the clinical importance of the criteria similarly in 17 % of cases. This illustrates the subjectivity of the assessment of the patient's context and shows that the importance attributed to inappropriate prescribing varies between evaluators.

Comparison with existing literature
The prevalence detected in our patients aged 80 and older (START: 59 %; STOPP: 41 %; Beers' drugs "to avoid": 32 %) did not substantially differ from the prevalence reported in the literature for populations including younger patients, in whom the prevalence ranges from 23 to 68 % for START-PIPs [29][30][31][32], from 18 to 60 % for STOPP-PIPs [30][32][33][34][35][36][37], and from 12.5 to 42 % for Beers-PIPs [38,39]. This observation is consistent with a study published in 2015 that compared the prevalence of START-PIPs, STOPP-PIPs and Beers-PIPs in patients aged between 75 and 84 years vs patients aged 85 and over, in the hospital setting [13]. The prevalence of PIPs was similar in both groups [13]. It should be noted that the prevalence of PIPs varies greatly between studies, depending on the setting and on the range of criteria taken into account.

Tinetti score >24 was considered as "low fall risk"; ≤18 was "high fall risk"

Tables 4 and 5 (examples of rated PIPs)

Criterion: STOPP-PIP "Long-term long-acting benzodiazepines".
Context: The patient has been taking 8 mg prazepam every day for more than a month. She has low fall risk (Tinetti score 26/28 (b)) but she has cognitive impairment (MMSE = 18/30).

Context: The patient has no history of coronary, cerebral or peripheral vascular symptoms or occlusive event.

Context: The patient is on clomipramine for "depressive tendencies" according to the GP. The GDS-15 score is low (3/15 (c)). Non-pharmacologic or safer alternatives are available.

Minor clinical importance (n = 2): Modification of the treatment according to these criteria brings no benefit or minor benefit, depending on professional interpretation.

Not applicable: The criterion is not applicable to the individual context of the patient.

Context: The patient had a single episode of suspected angina in the past, and he has asthma.

Criterion: START-PIP "Aspirin therapy in diabetes mellitus if coexisting major cardiovascular risk factors present".
Context: The patient is already on antivitamin K and he has no acute coronary disease.

Context: The prescription includes a patch of nitroglycerin and tablets of isosorbide dinitrate. However, in his notes, the GP specifies that the patient uses the tablets "as needed" only.

Criterion: STOPP-PIP "Long-term use of NSAID for symptom relief of mild osteoarthritis".
Context: The 83-year-old patient has chronic knee pain despite the use of paracetamol. Unfortunately, his severe respiratory and cardiac status is a contra-indication to surgery and he is intolerant to alternatives to NSAID. He is on a proton-pump inhibitor.

Context: This patient has cognitive impairment but also a long history of psychiatric disorders.

Criterion: Beers-PIP "Avoid benzodiazepines for the treatment of insomnia, agitation, or delirium".
Context: This patient received alprazolam to improve her sleep in a context of severe chronic anxiety.

Abbreviations: GDS-15 geriatric depression scale, GP general practitioner, MMSE mini mental state examination, NSAID non-steroidal anti-inflammatory drugs, NYHA New York Heart Association Functional Classification, PIP potentially inappropriate prescribing
(a) MMSE <25 was considered as "cognitive impairment"
(b) Tinetti score >24 was considered as "low fall risk"
(c) GDS-15 score >4 was considered as "possible depression"

In younger patients, STOPP-PIPs have been related to polypharmacy, age, institutionalisation and increased comorbidity, while START-PIPs were variably related to age, female gender and increased comorbidity [1,13,40]. Our results confirm the importance of the comorbidity burden in the risk of having PIPs.
Only a few previous studies looked at the clinical importance of PIPs detected by explicit tools in patients [14][15][16]. Our analysis on the subsample showed the same trends as previous studies on the Beers list [15,16] and STOPP [14]: a substantial proportion of PIPs were actually appropriate, the relevance varied according to the drug type and some criteria appeared less controversial than others (Table 5). Our review of the subsample of patients reinforces the idea that PIPs are actually only potential and that solely looking at the criteria is not sufficient to decide if the prescribing is inappropriate or not. A holistic approach to the patient challenges the PIPs detected by explicit screening tools, as illustrated in Table 5.

Strengths and limitations
This study goes one step further than previous observational studies on PIPs, by providing important findings about the validity of the use of explicit tools and highlighting how a holistic approach matters when reviewing the treatment.
This study presents some limitations. The data used to detect PIPs were not prospectively collected for the purpose of this analysis. Therefore, the quantity of PIPs might have been underestimated. For example, PIPs related to the history of falls, which were frequently reported in previous studies [41], were infrequent in our study (likely due to under-reporting of falls as a medical condition by GPs). However, we observed that a fifth of the patients on benzodiazepines were at high fall risk. Criteria related to delirium and dementia were expected to be infrequently encountered because these patients were excluded from the cohort.
The analysis of the clinical relevance of the criteria was only performed on a small subsample of PIPs. This assessment was designed to provide an insight but was not intended to comprehensively evaluate the content of the full criteria lists. Further studies should assess, on a larger scale, the actual inappropriateness of all the drugs listed in the tools. However, this subsample allowed us to discuss the most frequent PIPs and to identify several important points for discussion on the validity of the tools.

Implications for research and/or practice
Based on this analysis, including the examples detailed in Table 5, we suggest some important modifications to the tools to improve their validity and applicability (Table 6). The main suggestion for the future use of screening tools in daily practice is that these tools can only be used in addition to an assessment of the actual appropriateness of prescribing by clinicians with a good understanding of the patient's global health situation and full access to the patient's history. In many ways, the GP appears as the foremost potential user of the tools. Indeed, the GP knows the patient best, thanks to a long relationship and a global vision of the patient's medical, social and functional status. Very recently, an updated version of the STOPP&START tool was published [42]. Some of the criteria for which the validity was questioned in our study have been removed. However, we believe that most recommendations in Table 6 remain valid. Moreover, the message about the paramount importance of using the tools as part of a holistic approach is applicable to any explicit criteria list.

Situations that question the content validity of the criteria:
• START-PIP in patients already treated by suitable alternative medications, e.g., "Proton pump inhibitor with severe gastroesophageal acid reflux disease" in a patient already on histamine H2-receptor antagonist.
• START-PIP "Warfarin in the presence of chronic atrial fibrillation" in patients with low stroke risk.
• START-PIP "Regular inhaled β2-agonist or anticholinergic agent for mild-to-moderate asthma or COPD" in a patient with asthma due to acid reflux.
• STOPP-PIP "Any duplicate drug class prescription" because insufficiently defined.
• Beers-PIPs mentioning that a medication should be avoided as "first-line therapy" because such a feature is often difficult to detect.

Table 6 Recommendations to improve the validity and applicability of explicit tools
Recommendations to improve the validity of the criteria / Recommendations to improve the applicability of the criteria:
• mention of contra-indications of the criteria
• no contradictions between criteria
• no overlap between criteria
• precise range of application of the criteria (inclusion criteria)
• mention of time to benefit [45,46]
• clear definitions (conditions, diseases, drug categories)
• monitoring tips
• suggestions of alternatives (pharmacological and non-pharmacological)
• mention of adaptation to functional and cognitive status, life-expectancy, and multimorbidity
Our results therefore somewhat question the application of explicit screening tools to administrative databases. This approach, which was regularly used in previous studies [34,43], is valuable for gaining a global insight into PIP patterns and the most frequently encountered drugs. However, the prevalence and frequencies should be interpreted with caution. Issues identified can only be deemed potentially inappropriate owing to the limited clinical information available in such databases. Application of explicit tools to large databases should be refined so as to take into account factors that decrease the applicability of some criteria (e.g., contra-indications or presence of alternative drugs for START criteria).
Perspectives for future research are provided by this baseline analysis of the BELFRAIL cohort. Longitudinal analysis should compare the incidence of geriatric adverse events (death, hospital admissions, falls, adverse drug events) and costs of care in patients with and without confirmed PIPs at baseline. Additional qualitative research could shed light on some of the barriers to implementing screening tools. Furthermore, the viewpoint of patients on the appropriateness of their own treatment should be explored.

Conclusions
Our observations highlight the high prevalence of PIPs in very old patients in primary care. The medication review should be part of a comprehensive process to optimize pharmacotherapy. Explicit tools help to revise the treatment but will never replace good clinical judgement [44]. The general practitioner plays a key role in the management of chronic drug treatment and is therefore potentially in the best position to collaborate and to apply the explicit tool. A good understanding of the patients' medical, functional and social context is crucial to assess the actual appropriateness of drug treatment.