Expert-based medication reviews to reduce polypharmacy in older patients in primary care: a northern-Italian cluster-randomised controlled trial

Background: Evidence regarding clinically relevant effects of interventions aimed at reducing polypharmacy is weak, especially for the primary care setting. This study was initiated with the objective of achieving clinical benefits for older patients (aged 75+) by means of evidence-based reduction of polypharmacy (defined as ≥8 prescribed drugs) and inappropriate prescribing in general practice.
Methods: The cluster-randomised controlled trial involved general practitioners and patients in a northern-Italian region. The intervention consisted of a review of patients' medication regimens by three experts who gave specific recommendations for drug discontinuation. The main outcome measure was non-elective hospital admission or death within 24 months (composite primary endpoint). Secondary outcomes were drug numbers, hospital admissions, mortality, falls, fractures, quality of life, affective status and cognitive function.
Results: Twenty-two GPs/307 patients participated in the intervention group, 21 GPs/272 patients in the control group. One hundred twenty-five patients (40.7%) experienced the primary outcome in the intervention group, 87 patients (32.0%) in the control group. The adjusted rates of occurrence of the primary outcome did not differ significantly between the study groups (intention-to-treat analysis: adjusted odds ratio 1.46, 95%CI 0.99–2.18, p = 0.06; per-protocol analysis: adjusted OR 1.33, 95%CI 0.87–2.04, p = 0.2). Hospitalisations as a single endpoint occurred more frequently in the intervention group according to the unadjusted analysis (OR 1.61, 95%CI 1.03–2.51, p = 0.04) but not in the adjusted analysis (OR 1.39, 95%CI 0.95–2.03, p = 0.09). Falls occurred less frequently in the intervention group (adjusted OR 0.55, 95%CI 0.31–0.98; p = 0.04). No significant differences were found regarding the other outcomes. Definitive discontinuation was obtained for 67 (16.0%) of 419 drugs rated as inappropriate. About 6% of the prescribed drugs were potentially inappropriate medications (PIMs).
Conclusions: No conclusive effects were found regarding mortality and non-elective hospitalisations, either as a composite endpoint or as single endpoints. Falls were significantly reduced in the intervention group, although definitive discontinuation was achieved for only one out of six inappropriate drugs. These results indicate that (1) even a modest reduction of inappropriate medications may entail positive clinical effects, and that (2) focusing on evidence-based new drug prescriptions and prevention of polypharmacy may be more effective than deprescribing.
Trial registration: Current Controlled Trials (ID ISRCTN: 38449870), date: 11/09/2013.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12877-021-02612-0.


Background
Prescription and monitoring of drug therapy in older patients is challenging due to age-related physiological changes and frequent concomitant conditions [1]. Moreover, evidence regarding drug therapy in older adults is scarce or shows major research deficits: guidelines are often based on expert consensus or do not consider common comorbidities, e.g. cognitive dysfunction [2]. Benefits of therapies are usually demonstrated by trials involving younger and healthier persons and results might not be applicable to older-aged multimorbid patients [3] whose life expectancies are sometimes shorter than the time required to gain benefits from pharmacological treatments [4]. Clinical guidelines usually address single diseases and often require the use of several drugs per disease. Moreover, following new guidelines with lower thresholds for starting pharmacological treatment and the presence of new medications may induce physicians to prescribe more drugs [5]. Thus, in the treatment of patients with multiple chronic conditions, guideline adherence (as demanded e.g. by quality programmes) inevitably leads to the use of multiple medications [6].
As a consequence, polypharmacy has become a major health concern in older adults [7-9] and has been well documented to entail an increased risk of potentially inappropriate medication (PIM) [10,11], adverse drug events (ADEs) [12], medication nonadherence [13] and increased healthcare costs [14]. Polypharmacy was also demonstrated to be an independent predictor of nursing home admission, malnutrition, fractures, impaired mobility [15] and to be associated with higher rates of preventable hospitalisations [16,17], hospital re-admissions within short periods [18] and increased mortality [19].
In primary care, studies showed that up to 40% of patients aged ≥65 years [20] and 54% of patients aged ≥70 years [21] were affected by polypharmacy. In the long-term inpatient setting, the prevalence of polypharmacy is even higher [22].
Thus, the evidence clearly indicates that there is a strong need to reduce polypharmacy and inappropriate prescribing. Several strategies in this regard have been investigated in various settings: medication reviews performed by pharmacists and/or employing other healthcare professionals [23], multidisciplinary case conferences [24], pharmacist consultations [25], educational programmes [26], computerised support systems and multifaceted approaches [25]. The interventions seemed to be effective in reducing inappropriate prescribing; however, the overall quality of the evidence was considered to be low, and convincing effects on clinically relevant outcomes such as hospitalisations or mortality could not be demonstrated [23].
Nevertheless, given the high prevalence of polypharmacy and its potentially harmful impact on health outcomes, it seems plausible that reducing polypharmacy will influence clinical outcomes favourably. In the geriatric inpatient setting, a promising strategy has been investigated by Garfinkel et al., which consisted of the evaluation of drug regimens and the discontinuation, change or dose reduction of medications using an ad hoc developed algorithm. The study showed a reduction of mortality and acute hospitalisations; however, the study design was non-randomised and the sample was small [4]. Another five-step process to reduce polypharmacy ('deprescribing') seemed to be beneficial [27], but the evidence is inconsistent [28].
As medical care of chronically ill patients is mainly assured by general practitioners (GPs) [29] and GPs assume a crucial role in prescription and monitoring of drug therapies [30], well-designed interventions to reduce polypharmacy and inappropriate prescribing are strongly required for general practice.
Therefore, the present cluster-randomised controlled trial (RCT) 'PRIMA' (Polypharmacy in chronic diseases-Reduction of Inappropriate Medication and Adverse drug events in older populations) [31] was initiated with the aim of studying whether a prudent, evidence-based intervention to reduce polypharmacy leads to benefits for multimorbid older patients in the primary care setting.
The intervention comprised an appraisal of each patient's medication regimen by three experts who gave specific recommendations for drug discontinuation to the GPs.

Methods
The CONSORT guidelines for randomised trials of the EQUATOR network (Enhancing the QUAlity and Transparency Of health Research) were followed in the preparation of the study report.

Study design, population and setting
The cluster-RCT (12/2012-11/2016) was conducted in the primary care setting and involved GPs and older-aged community-living patients in the province of Bolzano (Italy). The observation period was planned to be 24 months.

Calculation of sample size
The calculation was based on the results of the abovementioned study by Garfinkel et al. [4] who used a geriatric-palliative algorithm for prudent reduction of polypharmacy and established the annual rate of acute hospitalisation as primary endpoint. The study observed an absolute risk reduction of 18% regarding the primary endpoint (corresponding relative risk reduction: 60%).
Because effects in non-randomised trials are usually overestimated, the Garfinkel study was set in a geriatric clinic, and our primary endpoint was different (non-elective hospital admission or death within 24 months), we assumed a relative risk reduction of 35-40%. We expected to recruit healthier patients and therefore estimated an annual rate of 60% of the rate observed by Garfinkel, giving an estimated annual rate of hospitalisations/death of 12.5% for the control group, corresponding to 25% in two years. For the intervention group, a relative risk reduction of 37% was expected (absolute risk reduction 9.3%), corresponding to an event rate of 15.8% in two years.
Assuming α = 0.05 and a power of 1-β = 0.8, we calculated a sample size of 602 patients (301 per arm). After considering a drop-out rate of 10%, the necessary sample size was 666 patients (333 per arm). Assuming a recruitment of 10-15 patients per GP office we aimed at recruiting 44-67 GPs.
As fewer GPs were recruited than expected a priori (43 GPs), the sample size was re-calculated assuming a larger risk reduction, from 25% to 15% instead of 15.8%, resulting in a final necessary sample size of 543 patients (272 per arm, 13 patients per GP).
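The sample size reasoning above can be retraced with a short calculation. The following Python sketch uses the standard normal-approximation formula for comparing two independent proportions. The text does not state which formula or software was used, nor whether a design effect for clustering was applied, so the function name `n_per_arm` and the choice of formula are assumptions, and the output is only expected to land near the reported figures.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, p_intervention: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm for a two-sided test of two independent proportions.

    Unpooled normal approximation; clustering is ignored (assumption).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    effect = p_control - p_intervention
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Two-year event rates taken from the text: 25% in the control group,
# 15.8% (original plan) or 15% (re-calculation) in the intervention group.
original = n_per_arm(0.25, 0.158)
recalculated = n_per_arm(0.25, 0.15)
print(original, recalculated)
```

Under these assumptions the first result falls a few patients short of the reported 301 per arm, which suggests that a slightly different formula or correction was applied in the original calculation.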

Recruitment of GPs
Recruitment was conducted between 01/2014 and 10/2014. The research team informed all 270 active GPs listed in the local Chamber of Physicians by email and phone and invited them to participate.
The participating physicians were informed about the study in an initial onsite meeting and were instructed regarding the collection of data, the electronic generation of the case report forms (CRFs, see below) and the procedure of the intervention. The GPs also received a written handout and video tutorials as instruction. Moreover, throughout the whole study period, the research team personally supported the GPs regarding all study procedures by means of telephone contacts and personal visits to the GP offices.

Recruitment of patients
The participating GPs identified consecutive eligible patients and invited them to participate during routine visits in the GP office.
All participating GPs and patients gave written informed consent. The GPs were remunerated for participation.

Inclusion criteria (patients)
• Age ≥ 75 years
• On therapy with ≥8 prescribed active agents. The cut-off of ≥8 drugs was chosen because the development of our study concept and the sample size calculation were based on the results of a former study conducted by Garfinkel et al. [4] (see above), which used a geriatric-palliative algorithm as deprescribing intervention and found a consumption of > 7 drugs among the included older-aged participants. The inclusion criterion for our study was therefore defined as the corresponding number of > 7 drugs, i.e. ≥8 drugs.
• Combination drugs were counted according to the number of active agents. PRN medications and OTC drugs were excluded, as the analysis was limited to chronic therapies and to prescribed medications, which were the only drugs that could be extracted electronically from the GPs' electronic health records (EHRs) during data collection.
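The counting rule behind the inclusion criteria can be illustrated with a minimal Python sketch. The `Prescription` record, its field names and the agent names are hypothetical and do not reflect the actual EHR export. Summing agents across products (rather than de-duplicating them) is likewise an assumption, and the exclusion criteria are not modelled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prescription:
    active_agents: Tuple[str, ...]  # a combination drug lists several agents
    prn: bool = False               # taken only as needed
    otc: bool = False               # over the counter, not prescribed


def count_active_agents(prescriptions: List[Prescription]) -> int:
    """Count active agents of chronic, prescribed drugs only."""
    return sum(len(p.active_agents)
               for p in prescriptions
               if not p.prn and not p.otc)


def is_eligible(age: int, prescriptions: List[Prescription]) -> bool:
    """Inclusion criteria from the text: age >= 75 and >= 8 active agents."""
    return age >= 75 and count_active_agents(prescriptions) >= 8


patient_drugs = (
    [Prescription(("agent_%d" % i,)) for i in range(6)]  # six single-agent drugs
    + [Prescription(("agent_a", "agent_b"))]             # one two-agent combination
    + [Prescription(("agent_prn",), prn=True)]           # not counted: as needed
    + [Prescription(("agent_otc",), otc=True)]           # not counted: OTC
)
print(count_active_agents(patient_drugs), is_eligible(80, patient_drugs))
```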

Exclusion criteria (patients)
Life expectancy less than three months according to the GP's judgement, advanced stage of cancer, radiation/chemotherapy, severe cognitive impairment with inability to give informed consent.
Footnote 1: In contrast to the European multicenter trial PRIMA-eDS [30], which used the same inclusion criteria and outcomes as our study and investigated the impact of an electronic decision support system providing a comprehensive medication review, the PRIMA study presented here was limited to a northern Italian region and applied expert-based medication reviews as intervention. The data generated by the independent Italian PRIMA study were not part of the PRIMA-eDS trial.

Cluster-randomisation
The participating GPs were randomised by the project team using computerised sequence generation. To avoid contamination, the units of randomisation were the GPs; thus, all included patients of one participating GP were allocated either to the intervention group (IG) or to the control group (CG). Randomisation was stratified by GP gender and practice location (urban or rural area) to avoid over-representation of a specific feature in one group. At the time of baseline data collection, neither GPs nor patients knew whether they would be part of the intervention or control group (allocation concealment).
Blinding of participating GPs and patients was not possible due to the nature of the intervention.
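A stratified cluster randomisation of this kind could look roughly like the following Python sketch. The text only mentions computerised sequence generation with stratification by gender and location, so the shuffle-and-alternate scheme, the record layout `(gp_id, gender, location)`, the seed and the GP identifiers are all illustrative assumptions rather than the procedure actually used.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

GP = Tuple[str, str, str]  # (gp_id, gender, location): hypothetical layout


def randomise_clusters(gps: List[GP], seed: int = 2014) -> Dict[str, str]:
    """Allocate whole GP practices to 'IG' or 'CG' within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for gp_id, gender, location in gps:
        strata[(gender, location)].append(gp_id)

    allocation = {}
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        # Random starting arm, so odd-sized strata do not always favour one group.
        arms = ["IG", "CG"]
        rng.shuffle(arms)
        for i, gp_id in enumerate(members):
            allocation[gp_id] = arms[i % 2]
    return allocation


gps = [
    ("gp01", "F", "urban"), ("gp02", "F", "urban"),
    ("gp03", "F", "rural"), ("gp04", "F", "rural"),
    ("gp05", "M", "urban"), ("gp06", "M", "urban"),
    ("gp07", "M", "rural"), ("gp08", "M", "rural"),
]
allocation = randomise_clusters(gps)
print(allocation)
```

Because the GP is the unit of randomisation, every patient simply inherits the arm of his or her GP; no patient-level random draw is needed.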

Pre-review
The medication plans of all participating patients, including those of the control group, were pre-reviewed by a GP of the project team who was not a study participant and by a student of pharmacology. They checked the medication plans for PIMs using the 2012 Beers criteria (Italian version) [32,33] and for drug-drug interactions (DDIs) using the Lexicomp/UpToDate® database [34]; only potentially severe DDIs were considered (categories D = consider drug modification and X = avoid combination).
The pre-review was carried out (a) to describe the whole study sample in terms of prevalence of PIMs, DDIs and associated factors [31], and (b) for the intervention group to provide the experts with information serving as a basis for the elaboration of the deprescribing recommendations (see below). The results of the pre-review were not communicated to the participating GPs at any time.

Intervention
Subsequently, every drug regimen was assessed by a specialist in internal medicine, a clinical pharmacologist, and an evidence-based medicine (EBM) expert in due consideration of current best evidence regarding pharmacological treatment in older patients, of disease-specific guidelines, and by considering the results of the pre-review, i.e. presence of PIM and/or severe DDIs. The experts were given all patient information documented on the CRFs (see below). As it was the scope of the intervention that each expert conducted the assessments using his profession-specific expertise, the project team was not informed in detail regarding additional instruments used by the experts (e.g., electronic decision aids).
Based on their medication review, the experts elaborated recommendations for discontinuation of inappropriate drugs and sent them to the research team. There was no prioritisation of drugs to be discontinued.
If at least two experts agreed on a specific recommendation, the recommendation and a brief explanation were forwarded to the respective GP by the research team. The GPs were invited to reflect on the recommendations in a shared decision-making process with the patient. All final decisions regarding continuation/discontinuation of drugs remained the responsibility of the GP and the patient. The GPs informed the project team whether they adopted the recommendations and gave an explanation in case of non-adherence. After three and six months, the research team contacted the GPs to remind them of the recommendations received and to discuss any difficulties regarding their application.
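The two-out-of-three agreement rule described above amounts to a simple vote count per drug. The Python sketch below illustrates it; the function name, the dictionary layout and the drug names are hypothetical and serve only as an example.

```python
from collections import Counter
from typing import Dict, List, Set


def consensus_recommendations(ratings: Dict[str, Set[str]],
                              min_agreement: int = 2) -> List[str]:
    """Return the drugs that at least `min_agreement` experts rated as inappropriate."""
    votes = Counter(drug for flagged in ratings.values() for drug in flagged)
    return sorted(drug for drug, n in votes.items() if n >= min_agreement)


# Hypothetical ratings for one patient's medication plan.
ratings = {
    "internist": {"drug_a", "drug_b"},
    "clinical_pharmacologist": {"drug_a", "drug_c"},
    "ebm_expert": {"drug_a", "drug_b", "drug_d"},
}
forwarded = consensus_recommendations(ratings)
print(forwarded)  # drug_c and drug_d were flagged by one expert only
```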
The intervention took place between baseline data collection (T0) and the first follow-up (FU1) (Fig. 1).

Control
The patients in the control group received usual care. This included potential use of guidelines by their GPs and drug changes during routine care, but they did not receive the structured medication review with recommendations by the three experts.
The GPs in the control group recorded the data of their included patients in the same way as the GPs in the intervention group.

Outcomes
The primary endpoint was a composite binary outcome of all-cause mortality or unplanned hospitalisations. A composite endpoint was chosen so that all patients could be included in the analysis, as the two component outcomes may compete. Additionally, all-cause mortality and non-elective hospitalisations were analysed as single secondary outcomes. All analysed primary and secondary endpoints and the respective measuring methods are described in Table 1.
Abbreviations used in Table 1: GPs, general practitioners; CRF, case report form; T0, baseline data collection; FU, follow-up study visit; T2, final study visit; EQ-5D, 5-item questionnaire measuring health-related quality of life; VAS, visual analogue scale; QoL, quality of life.

Data collection
After obtaining informed consent, every participating patient was documented via a structured CRF. The CRF was provided in electronic form for those GPs who used the EHR Millewin® (n = 39), which allowed the electronic integration of the CRF. For this purpose, an add-on module was programmed which automatically filled in all required patient data available in the EHR. The GPs checked these electronically generated CRFs and completed missing data manually (e.g., the results of the patient questionnaires). The electronic CRFs were then sent to the research team as an email attachment via the add-on module.
The add-on module supported the GPs also during the recruitment and intervention period. Every time the GP opened a patient health record, e.g. during visits in the GP office, consultations by phone or in case of drug prescriptions, alerts and reminders popped up regarding eligibility of a patient (recruitment phase) or regarding missing actions/documentations (intervention phase).
For the remaining n = 4 GPs using other EHRs, the CRFs were provided on paper. The GPs completed the paper CRFs manually and sent them via fax or email to the research group.
The GPs of both the intervention and the control group recorded the following parameters in the CRF at baseline (T0, prior to the intervention):
• Patients' age, sex, diagnoses (ICD-9-coded), current drug prescriptions (international non-proprietary names, ATC-coded), daily dosage of the prescribed drugs
• Biometric and laboratory parameters: height, weight, BMI, blood pressure, renal function parameters, potassium, heart rate, haemoglobin, HbA1c, hepatic enzymes, erythrocyte sedimentation rate, brain natriuretic peptide (BNP/NT-proBNP), International Normalised Ratio (INR)
• Symptoms: nausea, vertigo, pain, obstipation, diarrhoea, dyspnoea, angina pectoris, weight loss > 2 kg during the last month; falls requiring medical treatment, fractures, anaemia, gastrointestinal bleeding, cardiovascular problems, hospitalisation during the last 12 months. These parameters were collected to describe the study sample at baseline and to monitor patients throughout the study.
• Results of the questionnaires: EQ-5D-5L, 5-GDS, 6-CIT (Table 1)

Table 1 Primary and secondary endpoints and the respective measuring methods
Outcome | Measuring method | Description
Non-elective hospitalisations or all-cause mortality (primary endpoint) | Recorded by the GP in the case report form (CRF) at FU1, FU2, T2 (final data analysis) | Hospitalisations: number of episodes (referral to any acute care facility, either emergency department or hospital; elicited by the GP, by any other physician or by the patient himself). Death: number of patients
All-cause mortality | All events were recorded by the GPs in the CRF at T0, FU1, FU2, T2 | Number of patients
Non-elective hospital admissions | All events were recorded by the GPs in the CRF at T0, FU1, FU2, T2 | Number of episodes (see primary endpoint)
Falls | All events were recorded by the GPs in the CRF at T0, FU1, FU2, T2 | Number of falls requiring medical care
Fractures | All events were recorded by the GPs in the CRF at T0, FU1, FU2, T2 |
Number of drug prescriptions | Recorded by the GPs in the CRF at T0, FU1, FU2, T2 | Total number of prescribed drugs
Health-related quality of life | EQ-5D-5L and EQ-VAS [35]. The questionnaire was handed out to the patients by the GPs at T0 and T2 (prior to / after the intervention) | Five items addressing health-related QoL (mobility, self-care, usual activities, anxiety / depression, pain) and resulting in an index value (maximum = 1 = full health). Visual analogue scale EQ-VAS (range 0-100)
Affective status | 5-Item Geriatric Depression Scale (5-GDS) [36]. The questionnaire was handed out to the patients by GPs at T0 and T2 | Five items addressing satisfaction, tediousness, helplessness, social withdrawal, self-esteem. ≥2 points → presence of depressive symptoms (range 0-5)
Cognitive performance | 6-Item Cognitive Impairment Test (6-CIT) [37]. The questionnaire was handed out to the patients by GPs at T0 and T2 | Six items evaluating cognitive function. ≥10 points → significant cognitive impairment (range 0-28)

At the planned study visits after 8, 16 and 24 months (follow-up 1 = FU1, follow-up 2 = FU2 and final examination = T2), the same data were collected (except EQ-5D-5L/5-GDS/6-CIT: only at T0 and T2), and additionally the following events/parameters:
• Death
• Non-elective hospital admissions
• Only in the intervention group: the number and types of experts' recommendations and the reactions of the GPs (adoption or non-adoption of the recommendations, explanation in case of non-adoption, number and kinds of stopped drugs, drugs not discontinued despite the recommendation of discontinuation, re-prescription of a stopped medication).
All outcome measures (Table 1) were recorded by the GPs when they occurred or at the latest at the planned follow-up study visits. The GPs were also free to conduct additional visits at their discretion. The GPs learned about the events of interest (death, hospital admission, falls, fractures) from the patients' history during the study visits or other patient contacts within the study period, or from the patients' relatives. Discharge letters from hospitals and/or contacts with hospital physicians or other specialists also served as sources of information for the GPs regarding patient-related events.
All patient data were pseudonymised by the GPs (using anonymous patient numbers) and afterwards exported by the research team for statistical analysis.

Monitoring
All CRFs were checked by the project team. In case of incompleteness, the GPs were contacted to retrieve missing data.
As the intervention included the possible discontinuation of drugs (albeit drugs without an evidence base), the appearance of new symptoms was monitored thoroughly by the GPs. The GPs were informed that any drug discontinued in the study could be re-prescribed in case of symptom recurrence.
No explicit stopping rules were defined; no concerns that would have required stopping the study arose during the study period.

Data analysis
Categorical variables were summarised as absolute and relative frequencies, and numerical variables as median and interquartile range (IQR).
Mann-Whitney U tests and Fisher exact tests were used for unadjusted comparison of baseline characteristics and secondary outcomes (continuous and categorical, respectively) between the study groups.
Primary and secondary outcomes were also compared between groups in both uni- and multivariable settings using logistic regression (binary outcomes) and Cox regression (time-to-event outcomes) with cluster-robust standard errors of estimates, which take into account intra-cluster correlation [38] (Table 3). In the multivariable models, the following baseline variables were included for adjustment: sex, age, number of conditions, number of symptoms (within 1 month before T0), number of falls, number of fractures and number of hospitalisations (each within 12 months before T0).
The composite primary endpoint was analysed according to intention-to-treat and per-protocol principles, the secondary outcomes were analysed as per-protocol (see below).
All tests were two-sided; a significance level of p < 0.05 was used throughout.
Baseline demographical data of GPs/patients, diagnoses and medication-related data had no missing values.
Laboratory values were not available for all patients; in case of missing values, a listwise deletion was applied, i.e. individuals with missing data were excluded from analysis of laboratory values. When T2 data were missing due to death or withdrawal, we used the last recorded outcome value.
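The two rules for missing data described above (listwise deletion for laboratory values, last recorded value for outcomes missing at T2) can be sketched in Python as follows. The helper names, field names and example values are hypothetical, and the sketch is not the original analysis code.

```python
from typing import Dict, List, Optional


def last_recorded_value(visits: List[Optional[float]]) -> Optional[float]:
    """Return the most recent non-missing value; visits are ordered T0, FU1, FU2, T2."""
    for value in reversed(visits):
        if value is not None:
            return value
    return None


def complete_cases(rows: List[Dict], field: str) -> List[Dict]:
    """Listwise deletion: keep only patients with a recorded value for `field`."""
    return [row for row in rows if row.get(field) is not None]


# Hypothetical outcome series for a patient who withdrew after FU2.
print(last_recorded_value([0.80, None, 0.70, None]))

# Hypothetical laboratory data with one missing and one absent value.
labs = [{"id": 1, "hb": 13.2}, {"id": 2, "hb": None}, {"id": 3}]
print(complete_cases(labs, "hb"))
```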
Values obtained from EQ-5D-5L were converted into the EQ-5D index (single value per patient) by using the German EQ-5D-5L Crosswalk Value Set [39], as no country-specific value set was available for Italy [40] and Germany most closely approximates the investigated northern Italian region [41].

Results

Study participants
Of 270 invited GPs, n = 43 (15.9%) participated. The 43 GPs had 71,014 enrolled patients overall and 8015 patients aged ≥75 years. Out of these, 1075 patients (13.4%) were on therapy with ≥8 drugs and thus eligible.
The 43 GPs recruited n = 579 patients (53.9% of the eligible patients). After cluster-randomisation, 22 GPs and 307 patients were allocated to the IG, and 21 GPs and 272 patients to the CG. Ninety-four patients (IG: 57, CG: 37) were lost to follow-up because of death or withdrawal (Fig. 1).
Most baseline characteristics were well-balanced between IG and CG (age, gender, number of chronic conditions, laboratory parameters, health-related quality of life, cognitive function, affective status, number of pre-interventional fractures, frequency of PIMs and DDIs). Pre-interventional hospitalisations, falls and symptoms were significantly more frequent in the IG, while the median number of drugs was significantly higher in the CG (Table 2).

Primary endpoint and secondary endpoints
For the intention-to-treat analysis (ITT) of the primary endpoint, participants who were lost to follow-up were included as having reached the outcome (Table 3).
The secondary outcomes were analysed as per-protocol (PP) by including patients with outcome measures up to death or T2, excluding those participants who were lost to follow-up due to pre-interventional death or withdrawal (IG: n = 26, CG: n = 15; Fig. 1).
Primary outcome: In the IG, 125 of 307 patients (40.7%) experienced the primary outcome of at least one non-elective hospitalisation or death. Moreover, 26 patients (8.5%) were lost to follow-up due to withdrawal or pre-interventional death. In the CG, 87 of 272 patients (32.0%) experienced non-elective hospital admissions or death and 15 patients (5.5%) were lost to follow-up.
In both the ITT and PP analysis, the adjusted rates of occurrence of the primary outcome in the CG and IG groups did not differ significantly (ITT: adjusted OR 1.46, 95%CI 0.99-2.18, p = 0.06; PP: adjusted OR 1.33, 95%CI 0.87-2.04, p = 0.2).
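For orientation, a crude odds ratio can be computed directly from the event counts reported above (IG: 125 of 307, CG: 87 of 272). The Python sketch below uses a plain 2x2 table with a Wald confidence interval. It ignores both clustering by GP and the baseline covariates, so it is not the cluster-robust adjusted model used in this study and its interval should not be read as a reproduction of the reported adjusted estimates; the function name is an assumption.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def crude_odds_ratio(events_a: int, n_a: int, events_b: int, n_b: int,
                     level: float = 0.95):
    """Odds ratio of group A versus group B with a Wald confidence interval."""
    a, b = events_a, n_a - events_a   # group A: events, non-events
    c, d = events_b, n_b - events_b   # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts reported in the text; no adjustment for clustering or covariates.
or_, lo, hi = crude_odds_ratio(125, 307, 87, 272)
print(f"crude OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A cluster-robust analysis would typically widen this interval, because patients of the same GP are not independent observations.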
Secondary outcomes: Deaths and hospitalisations as single endpoints tended to occur more often in the IG; hospitalisations were significantly more frequent in the IG according to the unadjusted analysis (unadjusted OR 1.61, 95%CI 1.03-2.51, p = 0.04) but not in the adjusted analysis (adjusted OR 1.39, 95%CI 0.95-2.03, p = 0.09). Regarding falls, a statistically significant risk reduction favouring the IG was observed in the adjusted analysis (adjusted OR 0.55, 95%CI 0.31-0.98; p = 0.04).
No statistically significant difference between the treatment groups was found regarding fractures, number of drugs, EQ-5D/EQ-VAS, 5-GDS and 6-CIT scores (Table 3).

Results of the intervention
For 15.8% of all drug prescriptions, at least two experts agreed in rating them as inappropriate; thus, for these prescriptions, recommendations were elaborated and sent to the respective GP. The GPs received recommendations of drug discontinuation for 76.5% of the included patients. The EbM-expert rated 16.6% of all prescriptions as inappropriate, the clinical pharmacologist 14.1%, and the internist 13.6%. Pharmacologist and EbM-expert showed the highest agreement upon recommendations (79.7%), pharmacologist and internist the lowest (66.6%). The same three drug classes (anxiolytics/hypnotics, PPIs and beta-blockers) were rated most frequently as inappropriate by all three experts (Table 4).
The absence of a clear indication was by far the most frequent rationale for the recommendation to stop a drug (58.7%), followed by contraindication in older age (15%), contraindication as first-line treatment (13.6%) or as long-term therapy (6.4%), and presence of condition-specific contraindications (2.1%).
The GPs discontinued 24.3% of the recommended drugs in 37.2% of the concerned patients (Table 5).
For 55.8% of the 317 not discontinued inappropriate drugs, the GPs provided an explanation why they did not

Discussion

Endpoints
This northern-Italian RCT investigated the effect of medication reviews and recommendations provided by three experts with different professional backgrounds on patient-relevant outcomes in older-aged general practice patients on polypharmacy. The study found a high prevalence of polypharmacy, PIMs and DDIs, as described in the publication of epidemiological baseline data [31]. The composite primary outcome of non-elective hospital admissions or death was experienced significantly more often in the IG in the unadjusted analysis; yet, as significance disappeared after adjustment, which was also noted for hospitalisations as single secondary endpoint, this phenomenon seems to be strongly related to the higher occurrence of hospital admissions in the IG within the pre-interventional period. Also, the frequency of pre-interventional falls and symptoms was significantly higher in the IG. Thus, patients of the IG seemed to be in a less favourable physical condition at baseline than CG patients. This phenomenon could have been caused by the cluster randomisation (e.g., GPs of the IG could have systematically recruited more clinically impaired patients); however, the cluster effects were considered in the outcome analysis and did not significantly influence the results.
Hospitalisation rates (as secondary endpoint) remained higher in the IG than in the CG at T2; yet, the difference between the study groups was reduced compared to baseline. For both groups, the descriptive within-group analysis showed an increase of hospitalisations up to T2; the increase was more pronounced in the CG, where hospitalisations nearly doubled (Supplementary Tab.III). Therefore, although the intervention was not able to actually reduce mortality and hospitalisation rates, it may cautiously be interpreted as having demonstrated a positive impact in terms of a slowed increase of hospitalisations in a frail older-aged population with a natural tendency to deterioration of clinical and physiological functions.
No significant difference was detected regarding mortality as single endpoint and fractures. Both outcomes tended to occur more often in the IG, although not significantly so, probably because of the higher clinical impairment of IG patients at baseline.
The assessed patient-reported outcomes did not significantly differ between IG and CG either. In both treatment groups, quality of life and affective status showed a tendency to decrease over time, most probably due to the natural functional decline in older-aged patients. Interestingly, the cognitive function of the assessed participants remained stable throughout the observation period, which is consistent with the fact that only few patients with diagnosed dementia participated in the study [31] and that severe cognitive impairment was an exclusion criterion.
A positive result was, by contrast, found regarding falls: although they were more than twice as frequent in the IG at baseline, a significant risk reduction favouring the IG was observed at the end of the study. Therefore, the intervention seemed to have significantly reduced falls in the investigated older-aged population. On the other hand, as only a rather small part of the medications rated as inappropriate was actually withdrawn (see below), the extent of the real impact of the intervention on the measured outcomes should not be overestimated. Yet, our results may suggest that even modest reductions of inappropriate medications are able to entail clinical benefits, and to do so without negative impact on measured patient-related outcomes. This was confirmed also by the European multicenter trial PRIMA-eDS [42] and points to safety of deprescribing [27,43]. True improvements of patient-related outcomes, especially mortality and hospital admissions, may be difficult to achieve in older-aged multimorbid populations with a natural tendency to functional deterioration [22]; therefore, besides real measurable improvements, stabilisation of clinical outcomes could also be considered a positive result. Moreover, the medical impact of reduced falls should not be underestimated.
Table footnotes: i Mann-Whitney U test; ii Fisher exact test; § The following symptoms were considered: nausea, vertigo, pain, obstipation, diarrhoea, dyspnoea, angina pectoris, weight loss ≥2 kg; full list: Supplementary Tab.I; §§ Drug-drug interactions: category D = consider drug modification, category X = avoid combination [34]. IQR, interquartile range; EQ-5D, 5-item questionnaire measuring health-related quality of life; VAS, visual analogue scale; 5-GDS, 5-Item Geriatric Depression Scale; 6-CIT, 6-Item Cognitive Impairment Test; BMI, body mass index; PIMs, potentially inappropriate drugs; DDIs, drug-drug interactions.
Although it was significantly higher in the CG at baseline, the median number of drugs did not differ between the study groups at T2. In both groups, the number of drugs decreased over time, and the median reduction tended to be higher in the CG. Thus, in contrast to the findings of the PRIMA-eDS trial [42], the intervention in our study did not have a clear impact on the number of prescriptions. This is not surprising, as inappropriate drugs were discontinued in only 37% of the affected patients and only 16% of the drugs recommended for discontinuation were definitively withdrawn by the GPs; a notable reduction of the overall number of drugs could therefore not be expected. On the other hand, physicians of the CG could have changed their prescribing behaviour as well, owing to the awareness of participating in a study aiming at reducing inappropriate polypharmacy (study effect); this might have contributed to the even higher reduction of drug prescriptions in the CG.
The available literature shows a persisting paucity of studies investigating the reduction of polypharmacy in daily practice [11]; this applies especially to high-grade evidence and proven effects on patient-relevant outcomes. A British RCT with a comparable intervention in care homes largely confirmed our results: the study found a significant reduction in the number of falls, but no change in the number of drugs, hospitalisations, mortality, cognitive function or activities of daily living [44]. The PRIMA-eDS study found no conclusive evidence for a reduction of mortality, non-elective hospitalisations, falls or fractures, or for improvements in quality of life (SF-12 physical and mental component scores), while the number of drugs was significantly reduced without negative impact on patient outcomes [42].
Other studies in the primary care setting aiming at decreasing inappropriate polypharmacy also achieved significant reductions of drug numbers [45-48]. However, a recent update of a Cochrane review [23] found no clear evidence that the assessed interventions were able to reduce the number of inappropriate prescriptions, hospital admissions or medication-related problems, or to enhance quality of life [49]. These results confirmed those of an earlier systematic review and meta-analysis [50]. Positive impacts of deprescribing interventions on all-cause mortality were found in non-randomised studies [4], but convincing evidence from randomised trials is lacking [51].

Intervention
In our sample, 15.8% of all drug prescriptions were rated as inappropriate by at least two experts (median: one per patient). This number appears rather modest; however, more than three quarters of the patients were treated with at least one inappropriate drug and nearly one fifth received three or more inappropriate drugs. The EbM expert rated the highest proportion of drugs as inappropriate (16.6%), while the internist, who was the expert most closely related to real practice, rated the lowest proportion of drugs as inappropriate (13.6%).
In relation to their prescribing frequency, the most concerned drug classes in our cohort were anxiolytics/hypnotics, alpha-blockers, antiarrhythmics, NSAIDs/COX-2-inhibitors, PPIs and antidepressants/antipsychotics. Among these, antidepressants/antipsychotics, PPIs and anxiolytics were the most difficult to discontinue whereas the largest potential of deprescribing was observed for allopurinol and NSAIDs. As previous literature shows, NSAIDs belong to those drug classes causing the majority of drug-related hospital admissions [52]. Thus, a careful consideration of their risk and benefit may contribute to avoiding preventable hospitalisations. Yet, although NSAIDs were among the most successfully discontinued inappropriate medications in our sample, they were withdrawn in only 43% of those cases where discontinuation was recommended.

Concordance between experts

                                                  Pharm-Int      Pharm-EbM      EbM-Int
Patients where two experts fully agreed: n (%)    187 (66.6%)    224 (79.7%)    207 (73.7%)

§ After exclusion of pre-interventional deaths and withdrawals, the intervention was conducted on 281 patients (Fig. 1)

In total, 24.3% of the drugs recommended for discontinuation were stopped by the GPs. Of these, a third was restarted due to re-occurrence of conditions or symptoms; this concerned mainly antidepressants, PPIs, NSAIDs, benzodiazepines and beta-blockers. Thus, effective withdrawal was obtained for only 16% of the drugs recommended for discontinuation. A narrative review found lower general proportions of patients who needed to restart discontinued drugs (2-18%), while the success rates of definitive discontinuation differed widely across drug classes (14-64% for PPIs, 25-85% for benzodiazepines) [28]. A non-controlled pre-post study involving community-dwelling older adults found that 82% of inappropriate drugs were withdrawn (benzodiazepines almost 100%, PPIs 75%) and only 2% of the stopped drugs had to be re-administered. These numbers indicate a considerably higher discontinuation rate than in our study; however, the study sample was small [53].
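The relationship between the share of drugs initially stopped, the share restarted and the share definitively withdrawn can be checked with simple arithmetic. The following is a minimal sketch: the counts (67 of 419 drugs) are those reported in the abstract, the restart share is the approximate "a third" stated above, and the function name is purely illustrative.

```python
# Illustrative arithmetic only. The counts are taken from the abstract;
# the restart share is the approximate "a third" given in the text.
drugs_rated_inappropriate = 419   # drugs recommended for discontinuation
definitively_withdrawn = 67       # drugs still withdrawn at the end of follow-up

stopped_share = 0.243             # share of recommended drugs initially stopped
restarted_share = 1 / 3           # approximate share of stopped drugs restarted


def effective_withdrawal(stopped: float, restarted: float) -> float:
    """Share of recommended drugs that remain withdrawn after restarts."""
    return stopped * (1 - restarted)


approx = effective_withdrawal(stopped_share, restarted_share)
from_counts = definitively_withdrawn / drugs_rated_inappropriate

print(f"approximation from shares: {approx:.1%}")
print(f"from reported counts:      {from_counts:.1%}")
```

Both routes give roughly 16%; the small difference arises because "a third" is a rounded share.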
In general, the fact that many of the recommendations were not adopted by the GPs and that less than a fifth of the inappropriate medications were definitively discontinued makes a conclusive statement regarding the effect of the intervention difficult. Other studies achieved higher acceptance of experts' recommendations by the GPs (44-58%) [44,48]; however, these numbers also indicate that the implementation of such interventions meets significant barriers. This is a relevant result in itself, which raises the questions of why it is so difficult to discontinue drug therapies in patients with polypharmacy and which factors impede efficient deprescribing.
In our cohort, the most prevalent reason for recommending discontinuation by the experts was a missing indication; on the other hand, most of those GPs who gave a justification for non-adherence to the experts' recommendations reported that a specific indication was given. These contradicting points of view may have intrinsically lowered the potential for deprescribing; however, the high baseline prevalence of PIMs (46.3%) and major DDIs (66.1%) underpins the a priori necessity of deprescribing. Besides truly missing indications or symptoms/conditions falsely interpreted by GPs as correct indications, other scenarios could also have played a role, e.g. a GP who was aware of a condition and had treated it correctly but had not listed the respective diagnosis in the EHR [54]. This would represent a problem of documentation rather than of prescribing; thorough documentation is, however, an important precondition for high-quality therapy, especially when physicians or care providers change. In this way, although this was not an explicit trial objective, the study could have contributed to enhancing physicians' awareness of consistent documentation.
Other frequent reasons for not discontinuing medications were prescriptions by specialists and patients' refusal; these were also identified by previous studies as major barriers to deprescribing [28].
In general, the literature distinguishes three types of factors which hinder deprescribing. Physician-related barriers comprise lack of knowledge [55], low awareness regarding the identification of inappropriate drugs, inertia (failure to act despite awareness), and low perceived self-efficacy (e.g. GPs not 'daring' to stop a medication initiated by a specialist) [28]. System-related barriers are lack of resources and time, multiple care providers with poor collaboration among different care levels, lack of guidelines for older multimorbid patients, and missing financial incentives for GPs addressing polypharmacy [28,56]. Although studies revealed a willingness to deprescribe among patients [57,58], patient-related barriers were also identified: convictions regarding the necessity of drugs [28,59], satisfaction with the current therapy [57], fears of health deterioration [28,57,58], free prescriptions, older age, and patients' lower educational level [56].
A remarkable impact towards deprescribing was attributed to the GPs' recommendation to stop a drug, the possibility of discussing doubts with the GP [28], a good patient-physician-relationship, and the feeling that deprescribing would be safe [57]. Moreover, multidisciplinary approaches [56], guidelines for deprescribing (e.g. a deprescribing algorithm for PPIs is available) [60], and appropriate information of patients regarding risk and benefit of stopping drugs [56] were mentioned as facilitators to deprescribing.
To improve the success of deprescribing initiatives in daily practice, these findings indicate the need to enhance physicians' awareness of inappropriate drugs and of deprescribing, to provide appropriate time and financial resources that enable physicians to conduct effective and satisfying deprescribing conversations with their polypharmacy patients, to strengthen the GP-patient relationship and the physicians' skills in shared decision-making, and to develop well-designed patient guidelines that enhance patient knowledge [58,61]. In our study, although GPs received tailored supervision throughout the study period, patients were not directly approached, e.g. by educational initiatives. Moreover, GPs were not explicitly trained in polypharmacy and deprescribing. Perhaps a more active inclusion of patients in the intervention and an additional pre-interventional training of the GPs could have led to a more effective implementation.
Our results also support previous conclusions [7,62] that prevention of polypharmacy may be more successful than the subsequent deprescribing of drugs to which patients (and physicians) have become accustomed. Thus, future interventions should additionally focus on new prescriptions. In daily practice, besides medication reviews performed by physicians, pharmacists or multidisciplinary teams, electronic tools providing decision support in real time may also be useful (and even more practicable) for this purpose [42].

Strengths and limitations
A strength of the study is that we enrolled patients aged ≥75 years, as the older age groups have been less studied so far although they are the most vulnerable cohort of patients [5].
A further strength is the multidisciplinary approach, i.e. the involvement of experts from three different fields of specialisation with the need of concordance of at least two experts. The experts' recommendations were intended as an aid to the shared decision-making process between physicians and patients, not to replace clinical judgement and individual patient counselling.
Moreover, the close involvement of GPs and patients in the intervention can be considered a strength [28], as can the integration of the intervention in daily practice; however, at the same time, this made its implementation challenging and it probably met several barriers which we were not able to fully identify or address. Deprescribing addressed only the withdrawal of drugs; a more sensitive approach could be achieved by also recommending dose reduction, safer alternative drugs or starting appropriate drugs. However, the aim of this study was merely to assess the effect of drug discontinuation.
The calculated sample size was too low because clustering had not been considered. This had consequences for the investigation of the study hypotheses: only falls were significantly reduced in the intervention group, and it cannot be fully excluded that the intervention would have shown a significant impact on the primary endpoint or on other secondary outcomes had the sample size accounted for clustering. However, although statistical significance was not achieved for most of the endpoints, conclusions can be drawn from the significant result (reduction of falls). Moreover, as mentioned above, several previous studies using comparable interventions found similar results in terms of missing impact on mortality and hospitalisations; thus, the inadequate sample size was probably not the only, or not the primary, cause for not achieving significance in the primary endpoint.
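How strongly clustering inflates the required sample size is commonly approximated with the design effect, 1 + (m − 1) × ICC, where m is the mean cluster size and ICC the intracluster correlation coefficient. The following is a minimal sketch under stated assumptions: the mean cluster size is derived from the reported 579 patients and 43 GPs, whereas the ICC and the individually randomised sample size are hypothetical values chosen only for illustration and are not taken from the trial.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Standard design effect for (roughly) equal-sized clusters."""
    return 1 + (mean_cluster_size - 1) * icc


def inflate_sample_size(n_individual: int, mean_cluster_size: float, icc: float) -> int:
    """Sample size needed under cluster randomisation, rounded up."""
    return math.ceil(n_individual * design_effect(mean_cluster_size, icc))


# Reported in the abstract: 307 + 272 patients cared for by 22 + 21 GPs.
mean_cluster_size = (307 + 272) / (22 + 21)

# Hypothetical values, NOT from the trial: chosen for illustration only.
assumed_icc = 0.05
n_without_clustering = 500

deff = design_effect(mean_cluster_size, assumed_icc)
n_with_clustering = inflate_sample_size(n_without_clustering, mean_cluster_size, assumed_icc)

print(f"mean cluster size: {mean_cluster_size:.1f}")
print(f"design effect:     {deff:.2f}")
print(f"required n:        {n_with_clustering}")
```

Even a small assumed ICC raises the required number of patients noticeably at this cluster size, which illustrates why a sample size calculated without clustering is likely to be underpowered.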
The documentation of the time of occurrence was complete only for mortality, while for the other patient-related events (hospitalisations, falls, fractures) considerable documentation gaps emerged. Therefore, time-to-event analyses were not feasible for these outcomes or for the primary endpoint. Binary outcomes capture the rates of occurrence, but not potential differences regarding the time of occurrence; this leads to a loss of information which has to be considered another limitation of the study.
OTC medications were not included in the analysis because electronic data extraction was possible only for prescribed drugs, which were the only drugs recorded in the EHRs. OTC drugs could only have been collected by questioning the participating patients; however, older-aged patients do not always remember all drugs they are taking, and brown bag medication reviews with each patient were not feasible within the logistic constraints of the study and have also shown limitations of accuracy [63]. Thus, a reliable and complete determination of OTC drugs was not possible in this study, and these drugs were therefore excluded a priori.
However, as most continuously taken drugs are available only on prescription in Italy, the exclusion of OTC drugs should not have entailed a substantial bias. We also excluded PRN medications; thus, some drugs possibly interacting with diseases or other medications could have been missed as well.
We did not evaluate the specific causes of hospitalisations and mortality, e.g. by differentiating between ADE-related events and other causes. Quantifying the number of drug-related events could have provided a better description of the potential link between polypharmacy exposure and hospitalisations or mortality.
Inter-rater agreement between experts' recommendations was relatively high (≥66%) and the same three drug classes were most frequently valued as inappropriate by all experts. However, these findings also indicate that experts' appraisal and recommendations regarding drugs vary to some extent depending on the professional and clinical background.
We did not assess the GPs' experience or satisfaction regarding the intervention. Yet, this could be an interesting subject for future studies, e.g. a qualitative investigation of GPs' experiences with a similar intervention, which might help to improve its degree of implementation.
Blinding of study participants was not possible due to the nature of the intervention. Yet, allocation concealment was assured as baseline data were collected before randomisation.
Inhomogeneity of some baseline characteristics across the study groups was not avoidable due to the size of the study sample and because of heterogeneity among patients and practices. Randomisation at the patient level could probably have helped to achieve a better balance of baseline covariates between the study groups; nevertheless, cluster-randomisation was necessary in our study to avoid contamination effects.
Both the GP and the patient sample were consecutively recruited to reduce the risk of selection bias. Yet, we enrolled only community-living patients who visited the GP office. The GP sample was small and thus not fully representative. Generalisability is also limited by the fact that our findings derive from a specific Italian region and patterns of polypharmacy might differ in other countries, as well as in populations with different baseline characteristics (e.g. with high-grade cognitive impairment). However, as stated above, our results are confirmed by other studies with comparable interventions deriving from different European countries; thus, we postulate that our results and implications might be applicable also to other national circumstances.

Conclusion
Definitive discontinuation was feasible for only one out of six inappropriate medications and for one out of three patients with inappropriate drugs. Nevertheless, a significant reduction in the number of falls was noted, while other outcomes (mortality and acute hospitalisation in combination and as single endpoints, number of drugs, number of fractures, quality of life, affective status and cognitive function) were not significantly altered. Thus, our results highlight the importance of optimising drug therapies in older-aged patients and show that even a limited reduction of inappropriate medications can lead to positive effects without a distinct increase in undesired events.
An important finding is the low implementation rate of deprescribing suggestions and the relatively high rate of restarted drugs.
This may indicate that training for GPs about controlled and effective deprescribing could be beneficial.
Our findings indicate that real improvements of patient-related end-result outcomes like mortality and hospital admissions may be hardly achievable in older-aged populations with multiple conditions. Thus, for future interventions aiming at reducing inappropriate polypharmacy, we recommend considering whether stabilisation of clinical parameters would be a more appropriate outcome goal than real improvements; this could be realised e.g. by using a non-inferiority study design.
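The decision rule of a non-inferiority design can be sketched as follows: the intervention is declared non-inferior if the upper confidence bound of the difference in event rates (intervention minus control) stays below a pre-specified margin. This is a minimal sketch, not a re-analysis of the trial: it uses the crude counts of the primary outcome from the abstract, a simple Wald interval that ignores clustering and covariate adjustment, and a hypothetical margin of 10 percentage points that was not part of the study; all function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_upper_bound(events_new: int, n_new: int,
                                events_ref: int, n_ref: int,
                                alpha: float = 0.025) -> float:
    """One-sided upper Wald bound for the risk difference (new minus reference).

    Simplification: ignores clustering and covariate adjustment.
    """
    p_new = events_new / n_new
    p_ref = events_ref / n_ref
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(1 - alpha)
    return (p_new - p_ref) + z * se


def is_non_inferior(upper_bound: float, margin: float) -> bool:
    """Non-inferiority holds if the upper bound lies below the margin."""
    return upper_bound < margin


# Hypothetical margin, NOT pre-specified in the trial.
HYPOTHETICAL_MARGIN = 0.10

# Crude primary-outcome counts from the abstract: 125/307 (IG) vs 87/272 (CG).
upper = risk_difference_upper_bound(125, 307, 87, 272)

print(f"upper bound of risk difference: {upper:.3f}")
print(f"non-inferior at margin {HYPOTHETICAL_MARGIN:.2f}: "
      f"{is_non_inferior(upper, HYPOTHETICAL_MARGIN)}")
```

In a real trial the margin would have to be justified clinically and fixed before data collection, and the interval would need to account for the cluster design.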