Health profiles and socioeconomic characteristics of nonagenarians residing in Mugello, a rural area in Tuscany (Italy)

Background: Health, as defined by the WHO, is a multidimensional concept that includes different aspects. Interest in the health conditions of the oldest-old has increased as a consequence of the phenomenon of population aging. This study investigates whether (1) it is possible to identify health profiles among the oldest-old, taking into account physical, emotional and psychological information about health, and (2) there are demographic and socioeconomic differences among the health profiles.
Methods: Latent Class Analysis with covariates was applied to the Mugello Study data to identify health profiles among the 504 nonagenarians residing in the Mugello district (Tuscany, Italy) and to evaluate the association between socioeconomic characteristics and the health profiles resulting from the analysis.
Results: This study highlights four groups labeled according to the posterior probability of determining a certain health characteristic: “healthy”, “physically healthy with cognitive impairment”, “unhealthy”, and “severely unhealthy”. Some demographic and socioeconomic characteristics were found to be associated with the final groups: older nonagenarians are more likely to be in worse health conditions; men are in general healthier than women; more educated individuals are less likely to be in extremely poor health conditions, while the lowest-educated are more likely to be cognitively impaired; and office or intellectual workers are less likely to be in poor health conditions than are farmers.
Conclusions: Considering multiple dimensions of health to determine health profiles among the oldest-old could help to better evaluate their care needs according to their health status.

aging". This has been performed extensively among the younger-old in recent decades. However, as a consequence of the increasing number of oldest-old people in Western societies and their health characteristics and needs, it is only in recent years that studies focusing on the oldest-old have been conducted, aiming to understand the potential drivers of good health conditions at extremely old ages [6][7][8][9][10]. These studies have always focused on a specific dimension of health, such as cognition, physical and functional status or morbidities. However, health care needs are the result of a complex system of diseases, syndromes or health characteristics that cannot be described by a single dimension of health [11][12][13][14]. To consider the multidimensionality of individual health status, it is necessary to adopt a person-centered approach that is based not on the relationships among variables but rather on the characteristics of the individuals. This approach allows people to be classified into groups by taking only their individual characteristics into account [11,13].
To capture the heterogeneity of health status and evaluate the social disparities among individuals, researchers suggest the use of latent class analysis (LCA) as a person-centered approach [11][12][13]. LCA is a subset of structural equation modeling suitable for addressing multidimensional concepts, as in the case of health, to find groups of cases with similar characteristics in multivariate categorical data. The use of LCA in population health studies is extensive, with applications that vary from younger [15] to older individuals and elderly people [12][13][14][16][17][18][19][20][21][22][23][24]. Some scholars used this approach to identify profiles of health by considering functional, cognitive and psychological indicators [12-14, 16, 17, 22], with some evaluating socioeconomic differences among the health profiles [12,13,17,22] and others predicting the health care expenditures of people belonging to different groups [14,16]. Other researchers have applied a person-centered approach to identify profiles within a single aspect of health, such as morbidities [15,19,25], physical status [21], and depression [20], by considering several outcomes of the same health dimension. According to the existing literature, LCA could be used to identify groups of individuals requiring specific forms of health care and to predict their health care needs and expenditures. This approach could also help policymakers understand which groups of people to target with their interventions. The recent COVID-19 pandemic has again highlighted, especially in Italy, the vulnerability of groups such as the oldest-old and multichronic patients, who merit greater health policy focus [26].
It is also well documented that among elderly adults, demographic and socioeconomic characteristics influence health status and, consequently, health care needs and utilization [13,27,28]. Fewer researchers have evaluated this relationship among extremely old people, suggesting the persistence of social disparities in health, even in the last stages of life [29]. Gender, education and income were found to be associated with different health outcomes among the oldest-old individuals, prompting further investigation in this direction [6][29][30][31][32]. Evaluating the existence of a demographic and socioeconomic gradient in health among the oldest-old population could direct the attention of policymakers toward people who need interventions.
Despite the recognized advantage of using a person-centered approach for capturing the heterogeneity of health among elderly people, there is still little evidence relating to health profiles among the oldest-old and the extremely-old populations [33]. To fill this gap in the literature, we analyzed data from the Mugello Study [34], which included 504 nonagenarians from a rural area in Tuscany (Italy) called Mugello. Our aim is to determine whether it is possible to classify oldest-old people according to their multidimensional health status, defined by physical, cognitive and psychological health, to help in choosing the best care needed by this growing segment of the population. Furthermore, we investigate whether there are demographic and socioeconomic differences among their health profiles, contributing to the debate on social disparities in health in the last stages of life.

Study population and measures
The study population comes from the Mugello Study [10], which aimed to evaluate the aging process, focusing on different health aspects among nonagenarians living in 9 of the 11 municipalities of the Mugello area in Tuscany (Italy). It comprised 504 individuals representing approximately 65% of all nonagenarians living in that geographical territory in 2012. The participation rate was 69% after the exclusion of potential participants who died before being interviewed or who were not found. More information about the study design and survey methods is available in Molino-Lova et al. [10].
Extensive information about the individual health conditions of the nonagenarians was collected. For some of the health tests, it was not possible to assess the health status of several individuals. Individuals who were not tested due to their (very) poor health conditions were categorized as nontestable. Being nontestable is considered the worst health condition for each variable that includes this category. Variables were categorized according to the existing literature. Cognitive function was measured with the Mini-Mental State Examination (MMSE): the higher the score (0-30), the better the cognitive status [35]. MMSE scores were divided into three categories to distinguish people with severe (0-17), mild (18-23), and no cognitive impairment (24-30) [36]. Functional status was assessed according to the ability to perform five of the activities of daily living (ADLs) (eating, dressing, bathing, toileting, transferring) [37]. The number of ADLs that people could manage independently was used to distinguish between the non- (0), semi- (1-4), and fully autonomous (5) oldest-old individuals [38]. Mugello's nonagenarians were classified as disease-free (0), single-disease (1), and comorbid (2+) according to the number of chronic diseases (cardiovascular, neurological, pulmonary, connective tissue, gastroenterological, endocrine, renal, oncological, immunodeficiency syndrome) reported. The Geriatric Depression Scale (GDS) was used to evaluate depression status: the higher the score (0-15), the higher the level of depression [39]. GDS scores were divided into three categories to distinguish nondepressed (0-4), depressed (5-15), and nontestable individuals [40].
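The cut-offs described above can be written down compactly. The following is a minimal sketch and not the authors' implementation: the function names, the category labels as strings, and the use of `None` to mark a nontestable GDS score are assumptions made for illustration.

```python
from typing import Optional


def categorize_mmse(score: int) -> str:
    """MMSE (0-30): severe 0-17, mild 18-23, no impairment 24-30."""
    if not 0 <= score <= 30:
        raise ValueError("MMSE score must be between 0 and 30")
    if score <= 17:
        return "severe impairment"
    if score <= 23:
        return "mild impairment"
    return "no impairment"


def categorize_adl(n_independent: int) -> str:
    """Number of the five ADLs managed independently: 0, 1-4, or 5."""
    if not 0 <= n_independent <= 5:
        raise ValueError("number of ADLs must be between 0 and 5")
    if n_independent == 0:
        return "non-autonomous"
    if n_independent < 5:
        return "semi-autonomous"
    return "fully autonomous"


def categorize_diseases(n_chronic: int) -> str:
    """Number of reported chronic diseases: 0, 1, or 2+."""
    if n_chronic < 0:
        raise ValueError("number of diseases cannot be negative")
    if n_chronic == 0:
        return "disease-free"
    if n_chronic == 1:
        return "single-disease"
    return "comorbid"


def categorize_gds(score: Optional[int]) -> str:
    """GDS (0-15): nondepressed 0-4, depressed 5-15.

    Assumption: None marks an individual who could not be tested.
    """
    if score is None:
        return "nontestable"
    if not 0 <= score <= 15:
        raise ValueError("GDS score must be between 0 and 15")
    return "nondepressed" if score <= 4 else "depressed"
```

Keeping "nontestable" as its own category, rather than treating it as a missing value, mirrors the choice described in the text of regarding it as the worst health condition.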
Self-rated health status was assessed using the Italian version of the Short Form-12 questionnaire (SF-12), from which it was possible to obtain two synthetic indicators combining the 12 items: the Physical and Mental Component Summaries (PCS and MCS) [41]. The PCS and MCS were divided into three categories: those who scored higher than or equal to the average were considered to be in good health, those who scored lower than the average were considered to be in poor health, and nontestable individuals were considered to be in the worst health. It was also possible to obtain the global self-rated health (SRH) of the individual from the first item of the SF-12 ("in general, you would describe your health status as…"). SRH was divided into three categories to distinguish among nonagenarians declaring excellent/very good/good health, those declaring acceptable/poor health, and those who were nontestable.

Statistical analysis
Health is a complex state involving different aspects or dimensions. To capture the heterogeneity of the health status among the oldest-old individuals, we supposed that Mugello's nonagenarians could belong to unobserved or latent classes according to their health characteristics. For this purpose, we chose LCA, which aims to group individuals into classes according to their indicator patterns. Each class includes individuals with similar characteristics that nonetheless differ from the characteristics of those in other classes.
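To make the idea of grouping individuals by their indicator patterns concrete, the sketch below computes the posterior probability of each class for one response pattern, assuming the usual local-independence form of a latent class model. The function name, the parameter layout and the numbers in the usage note are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def posterior_class_probs(
    pattern: Sequence[int],
    class_weights: Sequence[float],
    item_probs: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Posterior class probabilities for one response pattern.

    pattern[m]          observed category (0-based) of indicator m
    class_weights[c]    prior probability of class c
    item_probs[c][m][k] probability of category k on indicator m in class c
    """
    joint = []
    for weight, probs_c in zip(class_weights, item_probs):
        likelihood = weight
        # Local independence: indicators are independent within a class.
        for m, k in enumerate(pattern):
            likelihood *= probs_c[m][k]
        joint.append(likelihood)
    total = sum(joint)
    if total == 0:
        raise ValueError("pattern has zero probability under every class")
    return [j / total for j in joint]
```

For example, with two equally likely hypothetical classes and two binary indicators, `posterior_class_probs((0, 0), [0.5, 0.5], [[[0.9, 0.1], [0.8, 0.2]], [[0.2, 0.8], [0.3, 0.7]]])` places most of the posterior mass on the first class, because that class makes the observed pattern far more likely.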
LCA was used to identify different health profiles according to the health condition through the variables described in the previous paragraph, controlling for demographic and socioeconomic characteristics. LCA with covariates is an extension of the basic LCA, permitting the inclusion of covariates to predict an individual's latent class membership [43,44].
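As an illustration of how a covariate can predict latent class membership, the sketch below evaluates baseline-category multinomial logit probabilities for a single covariate. It is a sketch under stated assumptions rather than the software used in the study: the function name and the coefficient values in the usage note are hypothetical, and the last class is taken as the reference class.

```python
import math
from typing import List, Sequence


def class_membership_probs(
    x: float, intercepts: Sequence[float], slopes: Sequence[float]
) -> List[float]:
    """Membership probabilities for C classes given one covariate value x.

    `intercepts` and `slopes` hold the C-1 coefficients of the
    non-reference classes; the reference (last) class has both fixed at 0.
    """
    if len(intercepts) != len(slopes):
        raise ValueError("intercepts and slopes must have the same length")
    linear = [b0 + b1 * x for b0, b1 in zip(intercepts, slopes)] + [0.0]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(linear)
    weights = [math.exp(v - top) for v in linear]
    total = sum(weights)
    return [w / total for w in weights]
```

With hypothetical coefficients, `class_membership_probs(1.0, [0.5, -0.2], [1.0, 0.3])` returns three probabilities that sum to one, and the log of the ratio between the first and the last equals 0.5 + 1.0 × 1.0 = 1.5, which is the log-odds of the first class relative to the reference class.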
We performed the LCA twice, including the same variables: once on the whole study population and once on the subsample of testable individuals. Since we expected the first analysis to yield a group populated only by nontestable individuals, we excluded those people in the second analysis to capture more heterogeneity in health status among the remaining oldest-old individuals. The effect of the covariates was estimated with the "one-step" technique to obtain less biased coefficients: they are estimated simultaneously as part of the latent class model [45,46].
Suppose a latent class model with C classes is to be estimated from M categorical variables and a covariate x. Let $Y_i = (Y_{i1}, \ldots, Y_{iM})$ be the vector of individual i's responses to the M variables, where $Y_{im} = 1, 2, \ldots, r_m$. Let $c_i = 1, 2, \ldots, C$ be the latent class membership of individual i; let $I(y = k)$ be the indicator function that is 1 if y is equal to k and 0 otherwise; and let λ be the probability of membership in each latent class. Then, the latent class model can be expressed as follows:

$$P(Y_i = y \mid x_i) = \sum_{c=1}^{C} \lambda_c(x_i) \prod_{m=1}^{M} \prod_{k=1}^{r_m} \rho_{mk \mid c}^{I(y_m = k)}$$

where $\rho_{mk \mid c}$ denotes the probability of response k to variable m conditional on membership in class c, and $\lambda_c(x_i)$ follows a standard baseline-category multinomial logistic model. In the case of one covariate, λ can be expressed as the following:

$$\lambda_c(x_i) = P(c_i = c \mid x_i) = \frac{\exp(\beta_{0c} + \beta_{1c} x_i)}{1 + \sum_{j=1}^{C-1} \exp(\beta_{0j} + \beta_{1j} x_i)}, \qquad \beta_{0C} = \beta_{1C} = 0$$

where C is the reference class in the logistic regression. As a result, the log-odds of an individual falling into latent class c relative to the reference class C, given $x_i$ as the value of the covariate, is the following:

$$\log\left(\frac{\lambda_c(x_i)}{\lambda_C(x_i)}\right) = \beta_{0c} + \beta_{1c} x_i$$

Multiple imputation was necessary to address missing values, assumed to be missing at random (MAR), and so avoid a loss of precision in the analysis. The K-nearest neighbor imputation method was used for its high performance with survey data [47]. To obtain unbiased results, neighbors were found by considering all the variables available in the dataset except those included in the models. Five neighbors were considered to calculate the aggregated values to impute.
Education, main occupation during the working lifespan, MMSE score, ADLs performed, number of chronic diseases, PCS and MCS were imputed. None had more than 7% missing values. More information about data imputation is included in Table S1 in Additional file 1.
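The imputation step can be sketched as follows. This is an illustrative, standard-library-only version and not the procedure or software used in the study: the function name is hypothetical, distances are squared Euclidean distances on auxiliary variables that are assumed to be complete, and the "aggregated value" is assumed here to be the most common category among the five neighbors.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[float]]


def knn_impute(
    records: Sequence[Record],
    target: str,
    aux_vars: Sequence[str],
    k: int = 5,
) -> List[Record]:
    """Return copies of `records` with missing `target` values filled in.

    Neighbors are the k complete records closest on `aux_vars` only, so
    variables used in the analysis model can be kept out of the distance.
    Assumption: `aux_vars` themselves contain no missing values.
    """
    donors = [r for r in records if r[target] is not None]
    if not donors:
        raise ValueError("no complete records to borrow values from")
    imputed: List[Record] = []
    for rec in records:
        filled = dict(rec)
        if rec[target] is None:
            nearest = sorted(
                donors,
                key=lambda d: sum((rec[v] - d[v]) ** 2 for v in aux_vars),
            )[:k]
            # Aggregate the neighbors' values by taking the most common one.
            votes = Counter(d[target] for d in nearest)
            filled[target] = votes.most_common(1)[0][0]
        imputed.append(filled)
    return imputed
```

The input records are left untouched and a new list is returned, which makes it straightforward to compare marginal distributions before and after imputation.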

Results
The 504 participants included a high number of women (369); the female/male sex ratio of 2.73 confirms the higher longevity of women. The mean age ± standard deviation was 93.1 ± 3.3 in the whole study population: the men's mean age (92.5) was lower than the women's mean age (93.3; t-test p = 0.01). Men were more educated (64.5% of males vs 46.1% of females completed more than 3 years of school) but performed more physical jobs: 80% of males vs 52.6% of females were farmers or low-skilled workers. Overall, men had better scores on all the health measures considered in the analysis. This result is partially explained by the sex-specific age structure of the study population. Large gender differences were found in cognitive and functional status (60.7% of males vs 37.1% of females were not cognitively impaired; 61.5% of males vs 43.6% of females were autonomous). The gap in the remaining health measures is mainly due to the larger number of nontestable women (Table 1).
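As a quick check of the reported sex ratio, using only the counts given above (504 participants, 369 of them women):

$$\text{men} = 504 - 369 = 135, \qquad \frac{\text{women}}{\text{men}} = \frac{369}{135} \approx 2.73$$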
Three latent classes were found when both the whole study population and the subsample of testable individuals were considered. This number was chosen according to the "meaning" of the classes, together with the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), whose values are shown in Table 2. Every latent class has been labeled according to the posterior probabilities (λ) of finding a certain characteristic in the class, as shown in Table 3.
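For readers less familiar with these criteria, the sketch below shows how AIC and BIC are computed from a model's log-likelihood and how the number of free parameters of a latent class model can be counted. The function names are hypothetical, and no log-likelihood values from the study are reproduced here.

```python
import math
from typing import Sequence


def lca_free_parameters(
    n_classes: int, categories: Sequence[int], n_covariates: int = 0
) -> int:
    """Count the free parameters of a latent class model.

    categories[m] is the number of categories of indicator m;
    n_covariates counts regressors after any dummy coding.
    """
    # Each class has (r_m - 1) free response probabilities per indicator.
    item_params = n_classes * sum(r - 1 for r in categories)
    # Each non-reference class has one intercept plus one slope per covariate.
    membership_params = (n_classes - 1) * (1 + n_covariates)
    return item_params + membership_params


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: lower values are preferred."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: penalizes parameters by log(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)
```

Assuming the seven three-category indicators described in the measures section and no covariates, `lca_free_parameters(3, [3] * 7)` gives 44 parameters for a three-class model. Because BIC penalizes each parameter by log(n) rather than 2, it tends to favor fewer classes than AIC, which is one reason the interpretability of the classes is usually considered alongside both criteria.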
LCA performed on the whole study population resulted in three health profiles. The first class is characterized by a high probability of being autonomous (λ = 0.89), not depressed (λ = 0.81), not cognitively impaired (λ = 0.78), perceiving good SRH (λ = 0.92), and having values of PCS and MCS higher than or equal to the average (respectively, λ = 0.73 and 0.65). This class, labeled the "healthy group", includes 215 individuals (42.9% of the whole study population). The second class is characterized by a high probability of being semi-/non-autonomous (respectively, λ = 0.47 and 0.44), cognitively impaired (λ = 0.97), and not testable for depression (λ = 0.97) and SRH (λ = 1); consequently, PCS and MCS were not testable (λ = 1 for both indicators). This class has been labeled the "severely unhealthy group". It includes 110 individuals (21.8% of the whole study population) and encompasses almost all nonagenarians who were nontestable on the scales that included this category (SRH, depression, PCS and MCS). The third class includes nonagenarians with a high probability of being semiautonomous (λ = 0.72), mildly/severely cognitively impaired (respectively, λ = 0.32 and 0.40), depressed (λ = 0.74), and having PCS and MCS scores lower than the average (respectively, λ = 0.74 and 0.66). Despite their performance on the objective health measures, they frequently declare a better health status: λ = 0.43 for declaring good SRH conditions is relatively high (poor SRH: λ = 0.57). For this reason, the last class, composed of 179 (35.3%) individuals, has been labeled the "partially satisfied unhealthy group".
LCA performed on the subsample of testable individuals also resulted in three health profiles. The first class is characterized by a high probability of being autonomous (λ = 0.88), not depressed (λ = 0.82), not cognitively impaired (λ = 0.83), and reporting good SRH (λ = 0.91), with PCS and MCS scores higher than or equal to the average (respectively λ = 0.71 and 0.67). This class has been labeled the "healthy group". It includes 202 individuals (53% of the testable subsample) who were almost the same individuals populating the "healthy group" resulting from the first analysis. The second class is characterized by a high probability of being semiautonomous (λ = 0.7), depressed (λ = 0.81), and reporting poor SRH (λ = 0.74), with PCS and MCS scores lower than the average (respectively λ = 0.91 and 0.65). This group of 128 individuals (33.3% of the testable subsample) has been labeled the "unhealthy group". The third group is characterized by a high probability of reporting good SRH (λ = 1), being semiautonomous (λ = 0.60), and having mild/severe cognitive impairment (respectively λ = 0.43 and 0.48), with MCS scores lower than the average (λ = 0.74) but PCS scores higher than or equal to the average (λ = 0.88). Posterior probabilities for depression are similar: λ = 0.43 not depressed vs λ = 0.57 depressed. This group was labeled "physically healthy with cognitive impairment". It included 55 nonagenarians (13.7% of the testable subsample). All the posterior probabilities are reported in Table 3.
The first class has been labeled the "healthy group" in both analyses: posterior probabilities followed a similar pattern, especially in terms of (good) health status items, as shown by the black and white circles in Fig. 1. The second class of the analysis on the whole study population was named the "severely unhealthy group" (see black squares in Fig. 1). It was composed of almost all the nontestable nonagenarians: individuals in the worst health conditions. When the nontestable individuals were excluded for the second analysis, many individuals populating the third class moved to the second, resulting in an "unhealthy group" with less extreme health characteristics. The consequence of this exclusion was more evident for the last (third) class obtained in both analyses. When considering all nonagenarians, we obtained the "partially satisfied unhealthy group", i.e., people mainly in poor health conditions but not always declaring poor SRH. As shown in Fig. 1, the "partially satisfied unhealthy group" (first analysis) and the "unhealthy group" (second analysis) had similar posterior probabilities for the (good) health status indicators, especially in terms of functional and cognitive status. Within the second analysis, the 55 (out of 385) nonagenarians composing the "physically healthy with cognitive impairment group" had a higher probability of declaring good SRH and obtaining a high PCS score than the "healthy group", but they had poor cognitive health, sometimes had depression and were mainly semiautonomous. The results are controlled for age, gender, education, and main occupation during the working lifespan (Table 4).
In the analysis on the whole of Mugello's nonagenarians, older individuals and housewives are more likely to be part of the "severely unhealthy group" instead of the "healthy group" (92-94 vs 90-91: odds ratio (OR) = 2.69; 95+ vs 90-91: OR = 7.25; housewives vs farmers: OR = 2.19), while being more educated reduces these odds (4-5 vs 3 years of education: OR = 0.49; 5+ vs 3: OR = 0.08). Being older also increases the odds of
Table note: Empty items are due to the subsampling: nontestable individuals are not included in the second analysis. For both analyses, class 1 is the "healthy group"; class 2 is, respectively, the "severely unhealthy group" and the "unhealthy group"; and class 3 is, respectively, the "partially satisfied unhealthy group" and the "physically healthy with cognitive impairment group".
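Odds ratios such as those reported in the text are the exponentiated coefficients of the class-membership regression, with the "healthy group" as the comparison class. The sketch below shows the conversion in both directions and what a given odds ratio implies for a probability. The function names are hypothetical, and the probability shift is restricted to the two classes being compared, which is a simplification.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a log-odds coefficient."""
    return math.exp(beta)


def log_odds_coefficient(or_value: float) -> float:
    """Log-odds coefficient implied by an odds ratio."""
    if or_value <= 0:
        raise ValueError("an odds ratio must be positive")
    return math.log(or_value)


def shift_probability(p_baseline: float, or_value: float) -> float:
    """Probability after multiplying the baseline odds by an odds ratio.

    Assumption: p_baseline is the probability of the comparison class
    versus the reference class, considering those two classes only.
    """
    if not 0 < p_baseline < 1:
        raise ValueError("baseline probability must be strictly in (0, 1)")
    odds = p_baseline / (1.0 - p_baseline) * or_value
    return odds / (1.0 + odds)
```

An odds ratio above 1 (for example, 2.69 for ages 92-94 vs 90-91) corresponds to a positive coefficient and raises the probability of the less healthy class relative to the reference class, while an odds ratio below 1 (for example, 0.49) corresponds to a negative coefficient and lowers it.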

Discussion
To identify health profiles among nonagenarians from Mugello (Tuscany, Italy), LCA was performed twice: first on the whole study population and then on the subsample of testable individuals, from which nonagenarians in the most extreme (worst) conditions were excluded. Removing these individuals from the analysis allowed us to capture more heterogeneity of health among the remaining oldest-old, especially among those in poor health whose profiles were masked by the nontestable individuals.
In both analyses, three classes were identified, resulting in a total of four different health profiles within the two LCAs performed, each labeled according to the posterior probabilities of finding certain health characteristics in them. Other researchers who looked at health profiles among elderly people by considering their physical, cognitive and psychological status found two to six classes [11-13, 17, 22]. In particular, other researchers could distinguish between a larger number of classes (four to six) [11,13,17,22], except for Ng et al. (2014), who identified only two profiles [12]. The fact that we found four health profiles within the two analyses means that, even at extremely old ages, there is still heterogeneity in the health conditions of the individuals. LCA allowed us to take into account the multidimensionality of health by including several health measures in the analysis. Having a larger study population could have helped to find the four profiles within a single LCA.
Fig. 1 (Good) health status item probabilities (λ) per health status resulting from the two latent class analyses (LCAs). Note 1: Class 1: "Healthy group", for both first (a) and second (b) LCAs; Class 2 for LCA-A: "Severely unhealthy group", for LCA-B: "Unhealthy group"; Class 3 for LCA-A: "Partially satisfied unhealthy group", for LCA-B: "Physically healthy with cognitive impairment group". Note 2: ADLs: Activities of Daily Living; MCS: Mental Component Summary; PCS: Physical Component Summary; Positive self-rated health: excellent/very good/good self-rated health
The "healthy group" (a), identified in both analyses and composed of almost the same individuals, and the "unhealthy group" (c), resulting from the second analysis, are consistent with other scholars' findings among younger adults, which included information on sensory health and specific chronic diseases [11,16] or quality of life and wellbeing [17]. Among nonagenarians, too, it was possible to find the two extreme groups of people in overall good and poor health. The "severely unhealthy group" (b), resulting from the first analysis, confirms that nontestable individuals are a stand-alone group of people who, because of their extremely bad health conditions, cannot be tested on their health status. The "physically healthy with cognitive impairment group" (d), i.e., individuals with good self-rated health and physical condition but bad cognitive status, is similar to what Lafortune et al. (2009) called the "cognitively impaired group" in their paper on the Canadian elderly, where the authors did not include information on the perception of health [11]. However, this result is at odds with what Zammit and colleagues found in 2012, in terms of self-perceived health, among the Lothian Birth Cohort 1936 "good fitness/low spirit group" [11,17]. It is known that one of the factors influencing the assessment of health among Italian elderly people is their physical status [50]. It is possible that, even at extremely old ages, physical health plays an important role in the self-assessment of health status. However, this could also be the result of the poor cognitive status of individuals populating the "physically healthy with cognitive impairment group".
Certain demographic and socioeconomic characteristics were found to be associated with membership in some of the latent classes. In this study, it is not possible to evaluate health deterioration itself, but even at extremely old ages, being older is associated with a higher probability of being in worse health. This suggests the need for further investigation of the health deterioration process among the oldest-old, as is commonly done for the younger-old [51][52][53]. Males have a lower probability of being in worse general health conditions, confirming that the so-called "gender paradox" also exists among the oldest-old: men are healthier than women at older ages [6,29,31,54]. The level of education is known to be associated with cognitive health in later life. Researchers analyzing English and Finnish nonagenarians show that this relationship persists at extremely old ages [29,32,55]. In the present study, more educated nonagenarians are less likely to belong to an "unhealthy group", while being less educated increases the probability of being among the cognitively impaired. These results are similar to those found in younger-elderly profiles [12,13]. Working experience is also associated with health conditions, although the results vary by occupation. In line with the existing literature, a person who was a nonmanual (office) worker had a lower probability of being in bad health condition at older ages compared to someone who worked as a farmer [56,57]. Housewives were more likely to be in the worst health conditions, similar to study findings among Finnish nonagenarians [29].
This study has public policy implications that need to be noted. Even among nonagenarians, individuals are heterogeneous in terms of health. To capture this heterogeneity by taking into account several dimensions of health, it is necessary to apply a suitable methodology. LCA has been widely used for this purpose, and policy makers should take advantage of it to identify heterogeneous groups of individuals to target with their interventions [11][12][13][14]. Analyzing different health dimensions at the same time allowed us to distinguish between the most vulnerable individuals with several health problems and those individuals with dimension-specific health deficits. According to our results, it is likely that people with poor physical health also have cognitive impairment, resulting in complex care needs. However, cognitively impaired individuals may be in good physical and functional status, requiring a different (specific) type of health assistance. Furthermore, health profiles were associated with socioeconomic status, showing that even among the oldest-old, the well-known socioeconomic gradient of health persists. As pointed out by Ng et al. (2014), this should prompt policy makers to direct their interventions toward the less advantaged groups of the population [12]. Other researchers evaluated the health care needs and expenditures among Taiwanese elderly people [14,16], showing how they differ among the health profiles that they identified. Being able to distinguish between groups of people with different health care needs is extremely important for reducing the excess health expenditure that may result from not considering health holistically [11].
This study has limitations that need to be noted. It is based on a cross-sectional dataset: health characteristics were collected only once. For this reason, we were not able to study the causal relationship between sociodemographic characteristics and health status and profiles. Furthermore, much of the information about health status is self-reported, and the cutoff points, chosen according to the existing literature, do not equate to a clinical diagnosis. Thus, it would be useful to validate them against objective measures. Finally, it is important to remark that Mugello's nonagenarians are a selected group of individuals in terms of health and mortality. Living in a rural area and following a Mediterranean diet are, for instance, factors that affect this selection.

Conclusions
Large samples of nonagenarians, for which much information has been collected about their health status, are still rare to find. Considering health as a multidimensional concept by identifying health profiles could help to better evaluate the care needs according to the different health profiles of each person, even among extremely old individuals [16,58]. The demographic and socioeconomic gradient of health resulting from the analysis suggests that policy makers focus their interventions on specific groups of individuals at younger ages to prevent an excess of health care expenditure later on.
Additional file 1: Table S1. Marginal distribution pre- and post-missing values imputation of characteristics of the study population. Absolute values, percentages and differences.
SRH: Self-Rated Health; WHO: World Health Organization
CS, PP contributed equally to the conception of the study. CL, FV, CM, LP contributed to data acquisition. CS, PP, VE contributed to the data analysis and the interpretation of the results. All authors contributed to the drafting of the study. All authors read and approved the final manuscript. All authors agreed both to be personally accountable for their own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature.

Funding
The Mugello Study was partially supported by the Italian Ministry of Health within the Current Research Program performed at National Research Institutes (IRCCS). The authors received no financial support for the research, authorship, and/or publication of this article.

Availability of data and materials
The data that support the findings of this study are available from Mugello Study but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Mugello Study.

Ethics approval and consent to participate
The Mugello Study was conducted according to the Helsinki Declaration on Clinical Research Involving Human Subjects and was approved by the Don Carlo Gnocchi Foundation Ethics Committee. Informed written consent was obtained from all the participants, or their proxies, before their inclusion in the study. Further details on the survey, including information on the territory and inhabitants, are available on the web (www.mugellostudy.com).

Consent for publication
Not applicable.