Study design and participants
This study used data from four waves of the China Health and Retirement Longitudinal Study (CHARLS 2011–2018), a nationally representative longitudinal survey of residents of China aged 45 years and above and their spouses. To achieve sample representativeness, a multistage probability sampling approach was used. In 2011, respondents were interviewed face to face, and they were followed up in 2013, 2015, and 2018. The response rate for the baseline survey was 80.5%, and the follow-up response rates for 2013, 2015 and 2018 were 82.6%, 82.1% and 83.8%, respectively [26]. CHARLS was approved by Peking University’s Ethical Review Committee (IRB00001052-11014). Prior to participation, each participant signed an informed consent form. Detailed descriptions of the CHARLS survey design and procedures are available in the cohort profile [27].
In this study, our analysis was limited to participants aged 60 and over. To take full advantage of data from four waves, individuals who had reached the age of 60 or above during the study period (2011–2018) and had at least two (range 2–4) cognitive function assessments after age 60 were eligible for inclusion. The participant’s first interview record was considered as the baseline (i.e., wave 1, 2, or 3, depending on when they joined CHARLS). For analytical purposes, we excluded: (1) participants without adult children (adult children were defined as children aged 22 or older who were not in school at that time); (2) participants with missing information on living arrangements at baseline; (3) participants with brain damage, intellectual disability, psychiatric problems or memory-related disorders at baseline; and (4) participants with missing information on demographic characteristics (gender, geographic residence, education, working status), health status (physical comorbidity, feeling pain, instrumental activities of daily living [IADLs], depressive symptoms, social activity participation), child characteristics (the number of adult children, average years of schooling of adult children), or socioeconomic level (average annual household expenditure per capita). The final sample comprised 6074 older respondents with complete data on key variables. We compared baseline characteristics between participants included in the final analysis and those excluded because of missing data (Table S2 in the Supplementary Material). The recruitment flow chart of the current study is shown in Fig. 1.
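The age and assessment-count criterion can be illustrated with a minimal sketch. The function names, the input layout (a mapping from survey year to cognitive score) and the approximation of age as survey year minus birth year are assumptions made for illustration; they are not taken from the CHARLS data files or the authors' code.

```python
from typing import Dict, List, Optional


def eligible_waves(birth_year: int,
                   assessments: Dict[int, Optional[float]]) -> List[int]:
    """Return survey years with a recorded cognitive score at age 60 or over.

    `assessments` maps survey year -> cognitive score (None if not assessed).
    Age is approximated as survey year minus birth year (an assumption).
    """
    return sorted(
        year for year, score in assessments.items()
        if score is not None and year - birth_year >= 60
    )


def is_eligible(birth_year: int,
                assessments: Dict[int, Optional[float]]) -> bool:
    """Eligible if at least two assessments were made at age 60 or over."""
    return len(eligible_waves(birth_year, assessments)) >= 2


# Hypothetical respondent: assessed in 2011, 2015 and 2018, missing in 2013.
example = {2011: 12.0, 2013: None, 2015: 10.5, 2018: 9.0}
print(eligible_waves(1953, example), is_eligible(1953, example))
```

The remaining exclusions (no adult children, missing covariates, memory-related disorders) would be applied as further filters on the baseline record.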
Measures
Cognitive function
Consistent with previous studies [28,29,30,31], cognitive function was captured in three domains: mental status, visuo-construction and episodic memory. Questions about orientation and numeric ability were used in CHARLS to measure mental status. Orientation was measured by asking respondents to identify the date (month, day, year), season, and day of the week. Numeric ability was measured by serial subtraction of 7 from 100 (up to five times). The numbers of correct answers to these questions were summed into the mental status score, which ranged from 0 to 10. The visuo-construction score was recorded as 1 if the participant could reproduce a figure previously shown to them; otherwise, it was recorded as 0. Episodic memory was evaluated by a word recall test. Participants were asked to recall as many as they could of 10 unrelated Chinese words they had just heard (immediate recall). Five minutes later, they were asked to recall the same list of words (delayed recall) [30]. Episodic memory scores were calculated as the average of the immediate and delayed word recall scores, ranging from 0 to 10.
The cognitive function score ranged from 0 to 21 and was calculated as the sum of the mental status, visuo-construction and episodic memory scores. The higher the score, the better the cognitive function. The Cronbach’s alpha was 0.78 [32], indicating a satisfactory level of internal consistency.
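As a worked illustration of the scoring described above, the following sketch combines the three components into the 0–21 total. The function and argument names are hypothetical and do not correspond to CHARLS variable names.

```python
def cognitive_score(orientation_correct: int,
                    subtraction_correct: int,
                    figure_drawn: bool,
                    immediate_recall: int,
                    delayed_recall: int) -> float:
    """Total cognitive score (0-21) from the three components.

    orientation_correct: correct answers on month, day, year, season,
                         and day of the week (0-5)
    subtraction_correct: correct serial subtractions of 7 from 100 (0-5)
    figure_drawn:        True if the figure was reproduced
    immediate_recall, delayed_recall: words recalled out of 10 (0-10 each)
    """
    if not (0 <= orientation_correct <= 5 and 0 <= subtraction_correct <= 5):
        raise ValueError("orientation and subtraction counts must be 0-5")
    if not (0 <= immediate_recall <= 10 and 0 <= delayed_recall <= 10):
        raise ValueError("recall counts must be 0-10")

    mental_status = orientation_correct + subtraction_correct   # 0-10
    visuo_construction = 1 if figure_drawn else 0                # 0-1
    episodic_memory = (immediate_recall + delayed_recall) / 2    # 0-10
    return mental_status + visuo_construction + episodic_memory


# Hypothetical respondent: 4 orientation items, 3 subtractions,
# figure drawn, 5 words recalled immediately and 2 after the delay.
print(cognitive_score(4, 3, True, 5, 2))  # 11.5
```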
Living arrangements
To capture important differences between types of living arrangements and to test the research hypotheses, living arrangements at the baseline survey were divided into the following five mutually exclusive categories: (A) living alone; (B) living with spouse (no adult children, may have others); (C) living with adult children (no spouse, may have others); (D) living with both spouse and adult children (may have others); and (E) living with others who are not spouse or children.
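The five categories can be expressed as a simple decision rule. The sketch below assumes three hypothetical household-composition flags derived from the baseline household roster; the flag names are illustrative only.

```python
def living_arrangement(with_spouse: bool,
                       with_adult_children: bool,
                       with_others: bool) -> str:
    """Map baseline household composition to categories A-E.

    The presence of other household members only matters when neither a
    spouse nor an adult child is present (categories A versus E).
    """
    if with_spouse and with_adult_children:
        return "D"  # spouse and adult children (may have others)
    if with_spouse:
        return "B"  # spouse, no adult children (may have others)
    if with_adult_children:
        return "C"  # adult children, no spouse (may have others)
    if with_others:
        return "E"  # others who are not spouse or children
    return "A"      # living alone


print(living_arrangement(True, False, True))  # B
```

Checking the spouse and child flags before the "others" flag is what keeps the categories mutually exclusive.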
Covariates
Given that cognitive function and living arrangements may differ depending on demographic characteristics, health status, child characteristics, and socioeconomic level, the following variables were included in this study as covariates. A time variable representing the number of years elapsed since the baseline interview was also included.
Demographic characteristics included age (at baseline), gender (male or female), geographic residence (urban or rural), education (no formal education, capable of reading and/or writing, primary school, middle school and above) and working status (yes or no).
Health status was measured by physical comorbidity, feeling pain (yes or no), instrumental activities of daily living (IADLs) (impaired or unimpaired), depressive symptoms (yes or no), and social activity participation (yes or no). Physical comorbidity data included conditions for which respondents self-reported receiving a diagnosis from a physician, such as dyslipidemia, diabetes or high blood sugar, chronic lung disease, etc. The number of physical comorbidities was calculated and categorized as 0, 1–2 and ≥ 3. Feeling pain was self-reported via the question: “Are you often troubled with any body pains?”. IADLs were evaluated using the Lawton and Brody scale, which covers doing housework, cooking, taking medicine, shopping, and taking care of finances [33]. Participants who reported difficulty with any item were classified as having impaired IADLs [34]. The Chinese version of the 10-item Center for Epidemiologic Studies Depression (CESD-10) Scale was used to measure depressive symptoms experienced by respondents over the last week. The ten items included three items on depressed mood, five items on somatic symptoms and two items on positive mood. Eight items were scored 0, 1, 2 or 3 according to the frequency of symptoms, and the two positive-mood items were reverse-scored. The total CESD-10 score ranges from 0 to 30, with higher scores indicating more severe depressive symptoms. Participants with a CESD-10 score above 10 points were classified as depressed [35]. The CHARLS questionnaire included eleven categories of social activities. Participation in social activities was defined as the respondent having participated in at least one of these social activities in the last month.
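The CESD-10 scoring and the comorbidity grouping can be sketched as follows. The positions of the two positive-mood items (`POSITIVE_ITEMS`) are an assumption for illustration and should be checked against the questionnaire; the cut-off follows the wording above (a score above 10).

```python
from typing import Sequence

# Zero-based positions of the two positive-mood items (ASSUMED for
# illustration; verify against the actual questionnaire ordering).
POSITIVE_ITEMS = (4, 7)


def cesd10_total(responses: Sequence[int]) -> int:
    """Sum ten items coded 0-3, reverse-scoring the positive-mood items."""
    if len(responses) != 10 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("expected ten responses coded 0-3")
    return sum(3 - r if i in POSITIVE_ITEMS else r
               for i, r in enumerate(responses))


def is_depressed(responses: Sequence[int], cutoff: int = 10) -> bool:
    """Depressed if the total is above the cut-off, as stated in the text."""
    return cesd10_total(responses) > cutoff


def comorbidity_category(n_conditions: int) -> str:
    """Group the count of physician-diagnosed conditions: 0, 1-2, >=3."""
    if n_conditions < 0:
        raise ValueError("count cannot be negative")
    if n_conditions == 0:
        return "0"
    return "1-2" if n_conditions <= 2 else ">=3"


answers = [1, 1, 1, 1, 3, 1, 1, 3, 2, 2]  # hypothetical respondent
print(cesd10_total(answers), is_depressed(answers))
```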
Child characteristics included the number of adult children of respondents and the average years of schooling of adult children, which was used to assess the overall educational attainment of adult children. The number of adult children was classified into three categories: 1, 2–3 and ≥ 4. The average years of schooling of adult children was centered by subtracting the sample mean.
Following previous studies [36, 37], we calculated the average annual household expenditure per capita to measure household resources. In developing countries, expenditure is a better measure of the economic resources available to households than income, and it is also measured with less error. To capture the non-linear relationship between expenditure and the outcome variables, the average annual household expenditure per capita was log-transformed in the analysis.
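The two covariate transformations (mean-centering and log-transformation) can be illustrated with a short sketch. The use of the natural logarithm and the rejection of non-positive expenditure values are assumptions; the text does not specify the log base or how zero values were handled.

```python
import math
from typing import List, Sequence


def center(values: Sequence[float]) -> List[float]:
    """Center a variable by subtracting its sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def log_expenditure(per_capita_expenditure: float) -> float:
    """Natural log of annual household expenditure per capita.

    Assumption: values must be strictly positive; how zero expenditure
    was treated in the original analysis is not stated.
    """
    if per_capita_expenditure <= 0:
        raise ValueError("expenditure must be positive to take the log")
    return math.log(per_capita_expenditure)


# Hypothetical average years of schooling of adult children
print(center([6, 9, 12]))  # [-3.0, 0.0, 3.0]
```

After centering, the model intercept refers to a respondent whose adult children have the sample-average level of schooling.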
Additional details of covariates are available in supplemental Methods and Table S1 in the Supplementary Material.
Statistical analysis
We used descriptive statistics to describe the characteristics of respondents. Continuous variables were presented as the mean and standard deviation (SD). Categorical variables were presented as frequency (n) and percentage (%). The t-test and chi-square test were used to test for differences in characteristics between males and females.
Multilevel models were used to assess the relationships between cognitive decline and living arrangements. Multilevel models are well suited to analyzing nested data (e.g. time points within individuals), which violate the assumption of independent observations [38]. An important advantage of multilevel growth models is that they can handle unbalanced data, which means that they do not require the same number of measurement occasions per individual to obtain efficient estimates [39]. In this study, up to four waves of repeated measurements (level 1) were nested within 6074 individuals (level 2).
Using data from up to four waves of data collection, we estimated three models for all respondents first and then separately for males and females. Model 1: adjusted model containing time, living arrangement and part of the covariates (age, gender and geographic residence [urban or rural]). Model 2: adjusted model containing time, living arrangement and all of the covariates (age, geographic residence, education, working status, physical comorbidities, feeling pain, IADLs, depressive symptoms, social activity participation, number of adult children, average years of schooling of adult children, household expenditure per capita). Model 3: Model 2 plus the interaction term between living arrangement and time. Model 3 was built to address our hypotheses concerning the association between living arrangements and cognitive decline. Older people living with their spouse were regarded as the reference group. The differences in the rate of cognitive decline between living with spouse and other types of living arrangements were indicated by the regression coefficients of living arrangements × time. To examine potential gender-specific effects, a stratified analysis by gender was conducted. Living arrangements and all covariates were measured at baseline and treated as time-invariant.
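For orientation, Model 3 can be written as a two-level growth model. This is an illustrative sketch, not the authors' exact specification: the notation is introduced here, and a random-intercept structure is assumed because the text does not state whether a random slope for time was fitted.

```latex
% Level 1: measurement occasion t within individual i
\mathrm{Cog}_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti},
\qquad e_{ti} \sim N(0, \sigma_e^2)

% Level 2: individual i, with living with spouse (B) as the reference
% group and k indexing the other four living arrangement categories
\beta_{0i} = \gamma_{00} + \sum_{k=1}^{4} \gamma_{0k}\,\mathrm{LA}_{ki}
           + \boldsymbol{\gamma}_{X}^{\top}\mathbf{X}_{i} + u_{0i},
\qquad u_{0i} \sim N(0, \sigma_{u0}^2)

\beta_{1i} = \gamma_{10} + \sum_{k=1}^{4} \gamma_{1k}\,\mathrm{LA}_{ki}
```

Here Time is years since baseline, LA are indicator variables for the non-reference living arrangements, and X is the vector of baseline covariates. Under this sketch, the coefficient on time for the reference group is the annual rate of change among those living with a spouse, and each interaction coefficient is the difference in that rate for the corresponding living arrangement. Setting the interaction coefficients to zero gives the form of Model 2.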
Considering that a large proportion of participants were excluded due to missing data, a sensitivity analysis was performed using multiple imputation (multilevel joint modelling multiple imputation) [40]. The results were similar after multiple imputation (shown in Tables S3 and S4 in the Supplementary Material). We did not apply sampling weights in the present study because the use of sampling weights in estimating causal effects and in multilevel analysis remains controversial and unsettled [41,42,43,44,45], and several studies using CHARLS data have shown that the results of weighted and unweighted analyses were similar [46, 47].
All descriptive analyses were conducted using Stata version 16.0, and multilevel analyses were performed using MLwiN 2.30. P < 0.05 was regarded as statistically significant.