Study cohort
CHARLS is a population-based, prospective cohort study that aims to collect high-quality micro-data representative of families and individuals in China, in order to analyze issues related to population aging and to promote interdisciplinary research on aging [18]. The survey used a four-stage, stratified, cluster probability sampling process to sample 17,708 middle-aged and older individuals from 150 counties in 28 provinces. The first visit of CHARLS was launched in 2011, and participants subsequently completed 3 follow-up visits (wave 2 in 2013, wave 3 in 2015 and wave 4 in 2018).
We excluded individuals with missing information on depressive symptoms, mobility, subjective memory, multimorbidity or other covariates, as well as individuals who were younger than 45 years or were lost to follow-up in waves 2–4. In addition, individuals with cognitive impairment (defined as a cognitive score < 6, i.e. 1.5 SD below the mean) were excluded [19, 20]. A total of 5196 subjects who completed all 4 visits, conducted every 2–3 years, were included in this study. Fig. S1 in Additional file 2 shows the detailed population selection process. Table S1 in Additional file 1 presents the baseline characteristics of the excluded respondents (n = 12,512), who were generally older, more likely to be women and unmarried, less likely to be smokers or drinkers, and more likely to have a lower educational level and a lower level of household income.
Measures
Depressive symptoms
Depressive symptoms were assessed using the Center for Epidemiologic Studies Depression Scale (CESD), which has been validated previously in Chinese older adults [21]. The scale comprised 10 questions about how often participants had experienced each of 10 symptoms in the past week. Participants responded to these questions on a 4-point scale (0 = rarely; 1 = some days; 2 = occasionally; 3 = most of the time). The total score was calculated by summing the scores of the 10 questions and ranged from 0 to 30 points. Depressive symptoms were defined as a CESD score ≥ 10. Only participants who had a complete assessment of depressive symptoms at each of the 4 visits were included. In this study, depressive symptoms were analyzed as a continuous variable based on the total CESD score.
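The scoring rule described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function names are hypothetical, and the item scores are assumed to be already oriented so that higher values indicate more frequent symptoms.

```python
from typing import Sequence

CESD_CUTOFF = 10  # threshold stated in the text (total score >= 10)


def cesd_total(item_scores: Sequence[int]) -> int:
    """Sum ten item scores, each coded 0-3, into a total between 0 and 30."""
    if len(item_scores) != 10:
        raise ValueError("expected exactly 10 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be coded 0, 1, 2 or 3")
    return sum(item_scores)


def has_depressive_symptoms(total: int) -> bool:
    """Apply the cutoff used to define depressive symptoms."""
    return total >= CESD_CUTOFF


# Made-up responses for one participant at one visit.
example_items = [1, 0, 2, 1, 0, 3, 1, 0, 1, 1]
example_total = cesd_total(example_items)
print(example_total, has_depressive_symptoms(example_total))
```

Note that the trajectory analysis uses the continuous total score; the binary cutoff is shown only because the text defines it.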
Multimorbidity
Participants were provided with a list of 14 chronic conditions and were asked to select the conditions that had been diagnosed by a doctor and had lasted at least half a year. Multimorbidity was coded as a binary variable: having none or one of the 14 chronic conditions versus having two or more.
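As a small sketch of this coding, assuming each participant's answers are available as 14 yes/no flags (the function name and input layout are hypothetical):

```python
from typing import Sequence

N_CONDITIONS = 14  # length of the chronic condition list described in the text


def multimorbidity(condition_flags: Sequence[bool]) -> int:
    """Return 1 if two or more conditions are reported, otherwise 0."""
    if len(condition_flags) != N_CONDITIONS:
        raise ValueError("expected one flag per chronic condition")
    return 1 if sum(bool(flag) for flag in condition_flags) >= 2 else 0


# Made-up participants with zero, one and two reported conditions.
none_reported = [False] * 14
one_reported = [True] + [False] * 13
two_reported = [True, True] + [False] * 12
print(multimorbidity(none_reported), multimorbidity(one_reported), multimorbidity(two_reported))
```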
Mobility
Mobility was assessed using a 9-item scale that has been shown to have good reliability. The items were: running or jogging about 1 km; walking 1 km; walking 100 m; getting up from a chair after sitting for a long period; climbing several flights of stairs without resting; stooping, kneeling, or crouching; reaching or extending arms above shoulder level; lifting weights over 10 jin (5 kg); and picking up a small coin from a table [22]. Each item was coded as a dichotomous variable (0 = ‘no, I don’t have any difficulty’; 1 = ‘I have difficulty but can still do it’, ‘yes, I have difficulty and need help’ or ‘I cannot do it’). The summary score was obtained by adding the scores of the 9 items and was then dichotomized into disabled (summary score ≥ 1) or not disabled (summary score = 0) [22].
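The two-step coding above (dichotomize each item, then dichotomize the sum) can be illustrated with the sketch below. The short response codes are hypothetical stand-ins for the four questionnaire answers and are not CHARLS variable values.

```python
from typing import Sequence

# Hypothetical short codes for the four possible answers to each item.
NO_DIFFICULTY = "none"
DIFFICULTY_CODES = {"can_still_do", "need_help", "cannot_do"}


def item_score(response: str) -> int:
    """Code an item as 0 for no difficulty and 1 for any reported difficulty."""
    if response == NO_DIFFICULTY:
        return 0
    if response in DIFFICULTY_CODES:
        return 1
    raise ValueError(f"unknown response: {response!r}")


def mobility_disabled(responses: Sequence[str]) -> bool:
    """Sum the 9 item scores and flag a summary score of 1 or more."""
    if len(responses) != 9:
        raise ValueError("expected responses to all 9 items")
    summary = sum(item_score(r) for r in responses)
    return summary >= 1


# Made-up participants: one with no difficulty, one with difficulty on one item.
no_limitation = ["none"] * 9
one_limitation = ["none"] * 8 + ["can_still_do"]
print(mobility_disabled(no_limitation), mobility_disabled(one_limitation))
```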
Subjective memory
Subjective memory was evaluated using a single item: ‘How would you rate your memory at the present time?’, with 5 response categories (excellent, very good, good, fair and poor) [23]. For analytical purposes, responses were recoded into 3 categories (excellent/very good/good, fair, and poor), considering that only a small proportion of participants reported excellent, very good or good subjective memory (0.4%, 6.4% and 15.2%, respectively).
Covariates
Covariates included age, gender, residence (rural and urban), marital status (married and unmarried), smoking status (smoker and non-smoker), drinking (drinker and non-drinker), educational level, household income and cognition scores. Smoking status was evaluated using two questions: “Have you ever chewed tobacco, smoked a pipe, smoked self-rolled cigarettes, or smoked cigarettes/cigars?” and “Do you still have the habit or have you totally quit?” Drinking was determined using a single item: “Did you ever drink alcoholic beverages in the past? How often?” Educational level was coded into 4 categories (< primary school, primary school, middle school and ≥ high school). Household income was recoded into tertiles (low, medium and high). Cognitive function was assessed in two domains: episodic memory and mental intactness. Immediate word recall and delayed word recall were used to evaluate episodic memory (range 0–20). The Telephone Interview for Cognitive Status (TICS) was used to measure mental intactness. The TICS consisted of serial subtraction of 7 from 100 (range 0–5); orientation to the date (month, day, and year), day of the week, and season of the year (range 0–5); and an intersecting pentagon copying test (range 0–1). The total cognition score was calculated as the sum of the items mentioned above (range 0–31).
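The arithmetic behind the 0–31 cognition score can be sketched as below. The function and argument names are hypothetical; the component ranges and the impairment cutoff (score < 6) are the ones stated in this section.

```python
IMPAIRMENT_CUTOFF = 6  # scores below this were treated as cognitive impairment


def cognition_total(episodic_memory: int, serial_subtraction: int,
                    orientation: int, pentagon: int) -> int:
    """Add the four components into a total cognition score (range 0-31)."""
    limits = {
        "episodic_memory": (episodic_memory, 20),        # immediate + delayed recall
        "serial_subtraction": (serial_subtraction, 5),   # serial 7s from 100
        "orientation": (orientation, 5),                 # date, weekday, season
        "pentagon": (pentagon, 1),                       # figure copying
    }
    for name, (value, upper) in limits.items():
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must be between 0 and {upper}")
    return episodic_memory + serial_subtraction + orientation + pentagon


def cognitively_impaired(total: int) -> bool:
    """Flag totals below the exclusion cutoff."""
    return total < IMPAIRMENT_CUTOFF


# Made-up component scores for one participant.
example_score = cognition_total(episodic_memory=8, serial_subtraction=3,
                                orientation=4, pentagon=1)
print(example_score, cognitively_impaired(example_score))
```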
Statistical methods
The raw depressive symptoms scores were adjusted for age by regression analysis, and the raw and predicted scores were then combined using the following equation to obtain age-adjusted z-scores:
$$z=\frac{Y-Y'}{\mathrm{RMSE}}$$
where Y is the raw depressive symptoms score, Y′ is the predicted mean score of depressive symptoms, and RMSE is the root mean square error for the regression model. The transformed z-scores were used in subsequent analyses [19, 24].
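The transformation can be illustrated with a short standard-library sketch. It assumes a simple linear regression of the raw score on age and an RMSE computed with n − 2 residual degrees of freedom; the text does not state either detail, so both are assumptions, and the data are made up.

```python
import math
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def age_adjusted_z(ages: Sequence[float], scores: Sequence[float]) -> List[float]:
    """Compute z = (Y - Y') / RMSE, where Y' is the age-predicted score."""
    intercept, slope = fit_line(ages, scores)
    residuals = [y - (intercept + slope * a) for a, y in zip(ages, scores)]
    # Assumption: RMSE uses n - 2 degrees of freedom (intercept and slope).
    rmse = math.sqrt(sum(r * r for r in residuals) / (len(ages) - 2))
    return [r / rmse for r in residuals]


# Made-up ages and raw depressive symptoms scores.
ages = [50, 55, 60, 65, 70, 75]
scores = [6, 9, 7, 12, 10, 14]
z_scores = age_adjusted_z(ages, scores)
print([round(z, 2) for z in z_scores])
```

A positive z-score then indicates more depressive symptoms than expected for a participant's age, and a negative z-score indicates fewer.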
Group-based trajectory modeling (GBTM), based on a censored normal distribution, was conducted to identify distinct trajectories of depressive symptoms z-scores. GBTM is a finite mixture modeling application that uses trajectory groups as a statistical device to identify distinctive clusters of trajectories in the population over time or age and to profile the characteristics of individuals within those clusters [25, 26]. GBTM assumes that the population is composed of a finite number of discrete groups and that there is no within-group variation among individuals in the same cluster. To determine the number of groups that best represents the heterogeneity of developmental trajectories, we first fitted a model with 1 group and then increased the number of groups one at a time up to 5, modeling the z-scores as a function of follow-up time. Follow-up time and its higher-order terms (up to cubic terms) were included one by one during model building. Model selection was based on the following criteria [27]: high mean posterior probabilities (> 0.7); membership of at least 5.0% in each trajectory group; and a reduction in the Bayesian information criterion (BIC) of at least 20. Higher-order terms were removed from the model if they were not significant or did not improve the goodness-of-fit of the model.
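One way to apply the three selection criteria is sketched below. This is only an illustration: it assumes a smaller-is-better BIC convention, summarizes each fitted model by its lowest mean posterior probability and smallest group share, and stops at the first model that fails a criterion. The `Candidate` class and all numbers are hypothetical and are not results from this study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    n_groups: int
    bic: float                 # assumption: smaller values indicate better fit
    min_mean_posterior: float  # lowest mean posterior probability across groups
    min_membership: float      # smallest group share, as a proportion


def select_model(candidates: List[Candidate]) -> Candidate:
    """Step from 1 group upward while every criterion is still satisfied."""
    ordered = sorted(candidates, key=lambda c: c.n_groups)
    best = ordered[0]
    for cand in ordered[1:]:
        adequate = cand.min_mean_posterior > 0.7 and cand.min_membership >= 0.05
        improved = best.bic - cand.bic >= 20
        if not (adequate and improved):
            break
        best = cand
    return best


# Made-up fit summaries for models with 1 to 5 groups.
fits = [
    Candidate(1, 61000.0, 1.00, 1.00),
    Candidate(2, 60500.0, 0.92, 0.30),
    Candidate(3, 60200.0, 0.85, 0.12),
    Candidate(4, 60150.0, 0.78, 0.06),
    Candidate(5, 60140.0, 0.66, 0.03),
]
print(select_model(fits).n_groups)
```

In practice these summaries would come from the trajectory software's output, and the polynomial order of each group would be tuned alongside the number of groups.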
To compare characteristics across trajectory groups, the Mann-Whitney test or the Kruskal-Wallis test was used for continuous variables, and the χ2 test was used for categorical variables. A multinomial logistic regression model was used to investigate the associations of multimorbidity, mobility and subjective memory with the trajectories of depressive symptoms z-scores. Multimorbidity, mobility and subjective memory were entered into the multinomial logistic regression models together, and odds ratios (ORs) with corresponding 95% confidence intervals (CIs) were reported. Covariates were added sequentially across 3 models: model 1 was unadjusted; model 2 was adjusted for baseline age, gender, residence, educational level, marital status, household income, smoking and alcohol drinking; and model 3 was additionally adjusted for baseline cognition scores.
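For readers less familiar with how ORs and CIs are derived from a fitted logistic model, the sketch below converts a regression coefficient and its standard error into an OR with a 95% CI. It assumes a Wald-type interval, which the text does not specify; the function name and input values are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error.

    Assumption: a Wald interval, exp(beta +/- z * se), with z = 1.96 for 95%.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Made-up coefficient for one predictor in one trajectory-group contrast.
or_value, lower, upper = odds_ratio_ci(beta=math.log(2.0), se=0.1)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
```

In a multinomial model, one such OR is obtained for each predictor and for each trajectory group compared with the reference group.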
In sensitivity analyses, multinomial logistic regression models were fitted separately in women and men to examine potential gender differences in the associations of multimorbidity, subjective memory and mobility with the trajectories of depressive symptoms z-scores. In addition, the analyses were repeated after excluding participants who had been treated for depression during follow-up. Trajectory groups were determined using a SAS macro (PROC TRAJ), and other statistical analyses were conducted using R 4.0.3. The statistical significance level was set at P < 0.05.