Brief cognitive assessment in a UK population sample – distributional properties and the relationship between the MMSE and an extended mental state examination

Background: Despite the MMSE's known flaws, it is still used extensively as both a screening instrument for dementia and a population measure of cognitive ability. The aim of this paper is to provide data on the distribution of MMSE scores in a representative sample from the UK population and to compare it with an extended cognitive assessment (EMSE) which covers a wider range of cognitive domains and provides a wider range of difficulty levels.
Methods: The MMSE and the EMSE were administered to over 12,000 participants at the screening stage of the MRC Cognitive Function and Ageing Study (MRC CFAS). MRC CFAS is a multi-centre population-based study in England and Wales with respondents aged 65 years and older.
Results: Normative values on the MMSE and EMSE are presented by age group, sex and level of education. There are very large differences between age groups, with smaller differences seen between the sexes and by level of education. The EMSE extends the scores at the high end of the ability range, but is no better than the MMSE at differentiating between dementia and non-dementia.
Conclusion: Population-derived norms are valuable for comparing an individual's score to the score that would be expected among the general population, given the individual's specific demographic characteristics.


Background
The Mini-Mental State Examination (MMSE) was developed almost 30 years ago as a screen for dementia among hospital patients [1]. It remains the most widely used short cognitive test in clinical practice, clinical research, and epidemiological studies [2,3]. However, its shortcomings are well known [4][5][6]. Principal amongst them are (a) very limited coverage of memory function, (b) a ceiling effect, resulting in inability to differentiate moderate from high functioning, and (c) absence of information about some aspects of cognitive function required for dementia diagnosis using internationally agreed criteria (DSM-IV [7]; ICD-10 [8]), namely perception and executive function. There is therefore a need for a short screening test that covers the range of cognitive processes required by diagnostic criteria, avoids the ceiling effect, and has an improved coverage of memory.
The Modified Mini-Mental State (3MS) Examination [4] was developed to extend the range of items in the MMSE and avoid ceiling and floor effects. It added eight extra items to the 19 items of the MMSE (date and place of birth, counting backwards from five to one, naming a body part, an easy repetition item, animal naming, a similarities item, and a delayed recall test), as well as a much extended scoring range for the original MMSE items, increasing the total score from 30 to 100. While the additional items extend the coverage of the MMSE by assessing additional aspects of cognitive function, i.e. remote memory (date and place of birth) and executive functioning (animal naming, similarities), they are mainly designed for the low end of the ability range rather than the high end. Moreover, the 3MS still omits the assessment of perceptual ability, which is required by diagnostic criteria to establish whether there is evidence of agnosia (DSM-IV [7]; ICD-10 [8]). A large population-based study comparing the MMSE and the 3MS in a Canadian sample of people aged 65 years and over reported that the superiority of the 3MS over the MMSE appears to be due more to its extended scoring system than to its additional questions [9]. The 3MS has an additional drawback: it uses non-standard versions of some of the MMSE items and additional items. Specifically, for the memory task, rather than using three high frequency object nouns such as "apple - table - penny", which became standard in the community version of the MMSE [10], the 3MS uses the words "shirt - brown - honesty", which are more difficult to remember as they do not form a single visual image. Also, the MMSE item asking subjects to write a sentence of their own choosing has been replaced in the 3MS by writing a sentence to dictation, which is far easier as it does not require the subject to generate a sentence.
With regard to the additional items, the animal naming task gives the subject 30 seconds to name 4-legged animals, in contrast to the standard semantic fluency task, which allows one minute to name any animals. Likewise, the 3MS uses a non-standard similarities question ("In what way are an arm and a leg alike?"). These deviations mean that the 3MS is not strictly comparable either with the MMSE or with other standard cognitive tasks.
In September 1986, before the publication of the 3MS, the MRC convened an Alzheimer's Disease Workshop [11] whose aim was to establish the minimum dataset that should be collected in research studies on dementia. This included demographic data, history of physical and psychiatric disorder, alcohol and drug use, onset and duration of any difficulties, a physical examination, and a cognitive assessment. For the cognitive assessment, the MRC report recommended using the standardised administration and scoring instructions for the community version of the MMSE [10]. In addition, the MRC report recommended including the following items: category fluency (animal naming), recalling a name and address, assessment of remote memory, assessment of recent memory, ideational praxis, abstract thinking (similarities), and recognition of objects from unusual views. The specific aim of the additional items was to broaden the coverage of the MMSE in relation to both content and level of difficulty. The content requirement was to meet the needs of diagnostic criteria for dementia by including measures of perception and executive function; the difficulty requirement was to meet the need to differentiate between scores at the high end of the ability range. Individuals whose premorbid cognitive ability was high might continue to obtain high scores on the MMSE and thus be missed by the MMSE screening test, even though their ability had in fact declined. It was hoped that the EMSE would measure higher cognitive abilities more readily and thereby differentiate between individuals at the high end of the ability range, giving scope to detect decline from an initially high level of functioning.
The MMSE items and most of the additional cognitive items recommended by the MRC report were included at the screening stage of the multi-centre MRC Cognitive Function and Ageing Study [12] and the findings are reported here. The screening test also included a measure of prospective memory (remembering to carry out an action), which has been reported elsewhere [13].
The aim of this paper is to describe the population distribution of performance on the MMSE and an extended cognitive assessment (EMSE) in a representative UK population.

Study design and population
The Medical Research Council Cognitive Function and Ageing Study (MRC CFAS) is a longitudinal population-based cohort study that involves six different study centres. The six centres were chosen because they represent the main national variation with regard to urban-rural differences, the north-south and east-west gradients, and variation in socio-economic levels and in known rates of chronic disease. Furthermore, all centres had existing researchers who were experienced in conducting population-based studies of the elderly. Urban sites included Liverpool, Newcastle, Nottingham, and Oxford, and rural sites included Cambridgeshire and Gwynedd, in North Wales. Liverpool was not included in this particular analysis because it was funded earlier than the other sites and had a different design without the same extended measurement of cognition. The full study design of the five identical centres is described in detail elsewhere and is explained briefly here [12]. Random samples of subjects over the age of 65 were selected from the Family Health Service Authority lists, giving an interviewed sample of approximately 2,500 people in each centre, stratified for equal numbers aged 65-74 years and 75 years and above. The study is longitudinal and this analysis focuses on information obtained at baseline, the prevalence (first) wave of the study. There were two phases at the prevalence wave. The first, screening, stage was used to establish level of cognitive performance and baseline risk factors on all individuals. A median of three months later, 20% of the subjects had a more detailed assessment interview to establish dementia diagnosis using the Geriatric Mental State (GMS) Automated Geriatric Examination Computer Assisted Taxonomy (AGECAT) diagnosis [14]. This group included the majority of individuals identified by the screening interview as potential cases of dementia, plus a random subset of the remaining population.

Cognitive measures
The screening interview included the Mini-Mental State Examination (MMSE) in the version developed for field surveys [10]. Spelling 'WORLD' backwards as an alternative to serial sevens was omitted to enhance standardisation. This version forms part of the Cambridge Cognitive Examination within the CAMDEX interview [15] and detailed administration and scoring instructions have been published [16,17]. The screening interview also included a selection of additional questions recommended by the MRC Alzheimer's Disease Workshop [11] as described previously. Coverage of language skills was extended by adding two objects to be named to the original MMSE objects. Praxis was extended by adding writing to dictation (a name and address) to the MMSE item writing a sentence, which requires the ability to write and to generate a sentence [4]. The coverage of memory was extended with three additional items: (a) asking subjects to recall the four objects they had named earlier in the session; (b) asking subjects to recall the name and address they had written earlier in the session; (c) a set of five questions assessing semantic memory or general knowledge. Executive function was assessed using a category fluency task (naming animals in one minute) and two similarities items. Perception was assessed by showing three photographs of familiar objects taken from unusual angles. These photographs and several of the other additional items (animal naming, writing to dictation, recalling the name and address, and one of the similarities items) were taken from the CAMCOG, and details of administration and scoring can be found elsewhere [16,17]. At screen, subjects were assigned an organicity score using the organic symptoms component of the AGECAT computerised algorithm [14]. The AGECAT at screen uses nine questions to obtain a level of organic symptoms.
This algorithm is based mainly on interviewer ratings and the only items common to the MMSE and AGECAT are orientation items (place name and address, and current date: day, month and year). The only further item in common between AGECAT and the EMSE is naming the current UK Prime Minister. The organicity score ranges from O0 to O5, with O3 indicating mild organic symptoms, and O4 and O5 indicating probable dementia diagnosis.
Twenty percent of the screened sample went on to the diagnostic assessment interview. This included the majority of those who had a screen AGECAT organicity score of O3 and above. Individuals with a score of O3 at screen could have had dementia, but they are a mixed group with mild organic symptoms that could relate to dementia, mild cognitive impairment or depression. Of this group, those who went on to receive a diagnosis of dementia at the assessment interview were regarded as demented for this analysis. All other individuals who scored O3 and for whom the interviewer reported moderate to severe memory impairment were excluded from the analysis for non-dementia norms.

Analysing performance on MMSE and EMSE
When describing performance on the MMSE and the EMSE, age was grouped into five-year bands (65-69, 70-74, 75-79, 80-84, 85 and above). Education was grouped into low level (9 years of schooling or less) and high level (greater than 9 years of full-time education). Those who had missing data about their educational attainment (n = 337) were placed in the low group.
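The grouping rules above can be sketched in a few lines of Python. This is only an illustration of the stated bands: the function names and the band label "85+" are assumptions made for the example and are not taken from the study's analysis code (the analysis itself was carried out in Stata).

```python
from typing import Optional


def age_band(age: int) -> str:
    """Return the five-year age band used in the normative tables."""
    if age < 65:
        raise ValueError("the study sample is restricted to ages 65 and over")
    if age >= 85:
        return "85+"  # label chosen here for "85 and above"
    lower = 65 + 5 * ((age - 65) // 5)
    return f"{lower}-{lower + 4}"


def education_level(years_of_schooling: Optional[float]) -> str:
    """Classify education as 'low' (9 years or less) or 'high' (more than 9).

    Missing educational attainment is placed in the low group,
    as described in the text.
    """
    if years_of_schooling is None or years_of_schooling <= 9:
        return "low"
    return "high"
```

For example, a 78-year-old with unknown education would fall in the "75-79" band and the low education group.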

Normative tables
To derive normative data on the MMSE and the EMSE, individuals classified as demented at screen or at assessment were excluded (n = 627). Tables by age group and sex, as well as by age, sex, and education are presented.

Missing data
Items that may have been missed due to sensory or motor impairment, called physical items, were recoded to 0 (i.e. treated as an incorrect answer). Such items include those involving writing or drawing, or those involving visual object recognition (see table 1 for details of items classified as being physical). Furthermore, in the MRC additional items, subjects are asked to recall an address that they have previously been asked to write. If the subject was physically unable to write the address, the instructions were for the interviewer to repeat it twice, and the subject would later be asked to recall it. There were a large number of missing values for these items; it is likely that interviewers omitted this recall in these individuals because of physical limitations. Therefore, the recall of the written address was categorised as a physical item and missing values were recoded to 0.
In general, individuals with items completely missing on the MMSE or the EMSE were not given scores for these tests. Many people were missing only one or two items from the MMSE or from the whole EMSE. Those missing two or fewer questions had their missing values recoded to 0 and were included in the analysis to ensure maximum use of the available data.
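A minimal sketch of this scoring rule is given below, under stated assumptions. The item names are hypothetical, `None` stands for a missing response, and the sketch assumes that the "two or fewer" limit is counted after the physical items have been recoded; the text does not say which order was used.

```python
from typing import Mapping, Optional, Set


def total_score(
    items: Mapping[str, Optional[int]],
    physical_items: Set[str],
    max_missing: int = 2,
) -> Optional[int]:
    """Sum item scores, applying the missing-data rules described above.

    Missing physical items are always scored 0. If more than `max_missing`
    other items are missing, no total score is returned.
    """
    # Step 1: missing physical items are treated as incorrect answers.
    recoded = {
        name: (0 if value is None and name in physical_items else value)
        for name, value in items.items()
    }
    # Step 2: count what is still missing (assumed order, see lead-in).
    still_missing = sum(1 for value in recoded.values() if value is None)
    if still_missing > max_missing:
        return None  # too incomplete to be given a score
    # Step 3: the few remaining missing items are recoded to 0.
    return sum(value or 0 for value in recoded.values())
```

With this rule, a respondent who could not attempt a writing item and skipped one other question still receives a total, whereas a respondent missing three non-physical items does not.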
A sensitivity analysis of this missing-data assumption was undertaken using a pro-rata score for the missing physical items. For individuals with missing data on physical items, the score was calculated as the proportion correct, with the missing physical items removed from both the numerator and the denominator.
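One way to read the pro-rata calculation is sketched below. It is an interpretation rather than the authors' code: the item names and the `max_points` mapping are hypothetical, and rescaling the proportion back to the full score range is an assumption made so that the result is comparable with the ordinary total.

```python
from typing import Mapping, Optional, Set


def pro_rata_score(
    items: Mapping[str, Optional[int]],
    max_points: Mapping[str, int],
    physical_items: Set[str],
) -> Optional[float]:
    """Pro-rata total that ignores missing physical items.

    Missing physical items are dropped from both the numerator (points
    obtained) and the denominator (points available); the resulting
    proportion is rescaled to the full score range (an assumption).
    """
    obtained = 0
    available = 0
    for name, value in items.items():
        if value is None:
            if name in physical_items:
                continue  # removed from numerator and denominator
            value = 0  # assumption: other missing items still count as 0
        obtained += value
        available += max_points[name]
    if available == 0:
        return None  # nothing left to base a score on
    return obtained / available * sum(max_points.values())
```

For a respondent with no missing physical items this reduces to the ordinary total, so the two scoring approaches only differ for those with sensory or motor limitations.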

Factors influencing missing scores
In order to have a clear picture of the sample on whom we have cognitive data, it is important to compare them with the sample from whom we were unable to obtain data. The interview was designed such that there was a small subset of questions deemed 'priority' questions to be answered by all individuals, if at all possible. These included the AGECAT organicity screen and the MMSE items. The interviewer could request 'priority mode' at any time or it was selected automatically if an individual was not orientated to time or place. Hence, there are missing data by design that need further investigation.
Several potential factors were investigated to describe the differences between those individuals who had a complete EMSE score, those who only had a complete MMSE score, and those who had neither test complete. These factors included the demographic variables gender, age, education and social class (as defined either by the respondent's current or last occupation, or for some women, by their husband's current or last occupation).
Other factors included dementia status, whether the subject appeared to be muddled, and whether the interview went into priority mode or had to be abandoned. Physical health was also analysed in relation to missing data, and included ADL impairment using the Townsend disability scale [18]. Interviewer-reported language problems or speech impairment, and an interviewer and self-reported evaluation of hearing impairment, visual impairment, and whether or not the subject was chairbound or bedfast were also included. Self-reported health problems including heart attack, transient ischaemic event, stroke, diabetes, Parkinson's disease, angina as measured by the Rose angina questionnaire [19], smoking status, and global self-reported health (excellent/good/fair/poor) were analysed.

Statistical Methods
Scores on the MMSE, and to a lesser degree the EMSE, do not follow a normal distribution. Hence, medians and other percentiles have been provided. For completeness, a logarithmic transformation (log(31 - MMSE) or log(61 - EMSE)) has been calculated, and the estimate and reference ranges have been back-transformed to the original scale. Version 6.2 of the CFAS data has been used in this analysis. The analysis has been undertaken using Stata version 8 [20].
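The transformation and back-transformation can be illustrated with a short Python sketch. It assumes natural logarithms and a reference range of the mean plus or minus two standard deviations on the log scale; the function name and the example scores are invented for illustration, and the original analysis was run in Stata. Because the transformation reverses the scale (high scores give small values of 31 - MMSE), the upper limit on the log scale maps to the lower limit on the original scale.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def log_reference_range(
    scores: Sequence[float], offset: float = 31.0, n_sd: float = 2.0
) -> Dict[str, float]:
    """Estimate and reference range via log(offset - score), back-transformed.

    Use offset=31 for the MMSE and offset=61 for the EMSE, as in the text.
    """
    logs = [math.log(offset - s) for s in scores]
    centre = mean(logs)
    spread = stdev(logs)

    def back(x: float) -> float:
        # Invert y = log(offset - score)  =>  score = offset - exp(y)
        return offset - math.exp(x)

    return {
        "estimate": back(centre),
        "lower": back(centre + n_sd * spread),  # scale is reversed
        "upper": back(centre - n_sd * spread),
    }


# Made-up scores, for illustration only.
example = log_reference_range([30, 29, 28, 27, 25, 22])
```

A useful property of this approach is that the back-transformed upper limit can never exceed the maximum attainable score, which a symmetric range on the raw scale would not guarantee.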

Results
The total number of individuals screened was 13,004, representing a response rate of 80%.
Figure 3. This figure depicts the joint relationship between the high end of the MMSE scores (20-30) [...]
Tables 3 and 4 present normative values for the MMSE and EMSE respectively, by age group and sex. All normative values are for individuals without dementia. There is a marked effect of age, with older subjects performing more poorly on both tests. The effect is particularly evident for the lowest percentiles (5th, 10th) of the distribution, and for two standard deviations below the mean of the distributions. There is also a modest effect of gender, with women obtaining lower scores, which is particularly marked in the oldest age groups.
The effect of education on cognitive performance can be seen in Tables 5 and 6 where scores are broken down by age group, sex and level of education. Level of education has a marked effect on both MMSE and EMSE scores for all age groups and for both sexes. Therefore, when education level is known, users of these tables are advised to consult Tables 5 and 6, as they provide a better estimate of the individual's expected level of cognitive ability.
Using selected cut-points from these normative tables, we examined how well these values were able to differentiate between demented and non-demented groups. These data are presented in Tables 7, 8 and 9. Table 7 examines absolute cut-points, without taking account of sociodemographic characteristics. Table 8 presents the results adjusted for age and sex, which is useful for cases where the level of educational attainment is unknown. Table 9 presents the data adjusted for age, sex and education. From these tables, it can be seen that the EMSE is no better than the MMSE at distinguishing demented from non-demented individuals. Roughly the same percentage of demented subjects fall below the 5th percentile, the 10th percentile, 1 standard deviation below the mean, and 2 standard deviations below the mean for the MMSE and the EMSE. Furthermore, when comparing the three tables, we find that adjusting for age and sex makes little difference to the percentage of demented subjects who fall below the given cut-points. Further adjusting for education seems to have no added benefit in this context. This suggests that the EMSE is primarily extending the description of higher functioning individuals rather than discriminating between the low functioning groups.
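The comparison described here amounts to counting how many individuals in each group fall below a chosen cut-point. A minimal sketch follows; the function names and the scores are invented for illustration, and "below" is taken to mean strictly less than the cut-point, which is an assumption.

```python
from typing import Dict, Sequence


def percent_below(scores: Sequence[float], cut_point: float) -> float:
    """Percentage of scores that fall strictly below the cut-point."""
    if not scores:
        raise ValueError("at least one score is required")
    return 100.0 * sum(1 for s in scores if s < cut_point) / len(scores)


def discrimination(
    demented: Sequence[float],
    non_demented: Sequence[float],
    cut_point: float,
) -> Dict[str, float]:
    """Summarise how well one cut-point separates the two groups."""
    return {
        # demented subjects correctly falling below the cut-point
        "demented_below_pct": percent_below(demented, cut_point),
        # non-demented subjects correctly at or above the cut-point
        "non_demented_at_or_above_pct": 100.0
        - percent_below(non_demented, cut_point),
    }


# Made-up scores, for illustration only.
summary = discrimination([10, 15, 20, 25], [22, 26, 28, 30], cut_point=24)
```

Running the same function with a cut-point taken from each normative table (absolute, age and sex specific, or age, sex and education specific) would give the kind of side-by-side comparison reported in the text.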

Impact of missing data
Excluding the missing physical item responses from both the numerator and denominator caused little change in the results. The median and 5th percentile MMSE scores were at most one point higher. The EMSE median scores were not affected; however, the 5th percentile was at most 2 points higher.
A comparison of the characteristics of those with complete EMSE scores, those with only complete MMSE scores and those with neither score complete has been undertaken. Those with no complete data or with only the MMSE complete were more likely to be older, female, in a manual occupation, and have a low level of education. They were also far more likely to have been classified as demented at screen. The vast majority of those who only had the MMSE complete (95%) went into priority mode during the screen interview. In general, those who completed both tests had fewer health problems than the other two groups.

Discussion
This paper presents the full normative values for the MMSE in a UK population sample aged 65 years and over, together with normative values for an extended cognitive assessment (EMSE), with a more complete coverage of cognitive domains than the MMSE and a wider difficulty range. The normative values have been calculated for the whole population sample, excluding those with probable or diagnosed dementia. The MMSE norms add to the existing literature, in both English and other languages, providing norms based on the largest study to date for this age group [21][22][23][24].
This paper also presents normative values for a new scale, the Extended Mental State Exam (EMSE), which combines the MMSE with additional items recommended by an MRC Alzheimer's Disease Workshop [11]. Results show that the EMSE is more normally distributed than the MMSE and avoids the ceiling effect known to impair the MMSE. We have examined how well selected MMSE and EMSE values differentiate between individuals with and without dementia. Both perform moderately well for this purpose. Like the EMSE, the Modified Mini-Mental State (3MS) Examination [4,25] extends the coverage of the MMSE and produces a much wider range of scores. However, the 3MS does not cover one of the domains of cognitive function required for a diagnosis of dementia, i.e. perception, and the additional items are geared more towards extending the low end of the ability range than the high end.
Figure: MMSE scores in the demented and the non-demented.
Although the 3MS incorporates all the MMSE items, some have been modified, which makes it difficult to compare them with standard MMSE scores (see Background). In contrast, the EMSE incorporates the standardised field survey version of the MMSE [10] and the additional items are also presented in a standard way, thus enhancing the comparability between the EMSE and other measures. However, unlike those in the EMSE, the additional questions in the 3MS have been shown to assist in differentiating between individuals with and without dementia [9].
The values of the population norms were affected by level of education, as in other studies [21,22,26]. However, a somewhat unexpected finding was that using age-, sex- and education-specific cut-points did not improve the discrimination between the normal and demented groups, as found previously using the 3MS [27], but in contrast to other studies that have used the MMSE [28,29].
Figure: EMSE scores in the demented and non-demented.
Other researchers have investigated the use of the MMSE in individuals with physical impairments and suggested improvements [30], but in this large sample we did not find that changing the coding of missing physical items had much impact on the distributions of either the MMSE or the EMSE. This has also been seen with the MMSE in other studies [31].
The results show that the EMSE is comparable to the MMSE in its ability to differentiate between individuals with and without dementia. However, as described above, the EMSE is superior at providing data for individuals at the high end of the performance range, in a similar way to other tests (e.g. TICS-M and Hopkins Verbal Learning Test [32,33]).
The choice of a screening test for dementia depends on the type of population to be screened, and the aim of the screening procedure. If the population to be screened can be expected to perform poorly on a cognitive function measure (e.g. hospital patients), then we believe that the EMSE has few advantages over the shorter MMSE. This is also true if the aim of screening is to pick up definite cases of dementia. However, if the purpose of the cognitive test is to examine whether individuals have early cognitive changes or mild cognitive impairment (MCI), or when the population to be measured includes many high performing individuals (e.g. population surveys), then the EMSE applied longitudinally has distinct advantages over the MMSE, and the additional 3 minutes of administration time may be regarded as worthwhile.
Figure 4. Percentiles for the MMSE and EMSE by age.
The data reported here are from the first cross-sectional wave of the MRC Cognitive Function & Ageing Study [12]. The EMSE has also been administered at later waves of the study, and later papers will examine longitudinal aspects of the EMSE and its ability to detect new cases of dementia.

Conclusion
Population-derived norms are valuable for comparing an individual's score to the score that would be expected among the general population, given the individual's specific demographic characteristics.

Role of funding source
The funding bodies have had no influence on the paper or decision to publish.

Conflict of interest
The author(s) declare that they have no competing interests.

Contributions
FH oversaw the clinical context of the paper; SC undertook the initial statistical analysis and co-wrote the paper; FM oversaw the analysis, undertook the final analysis and co-wrote the paper. MRC CFAS investigators undertook the fieldwork, oversaw the scientific integrity of the study and commented on the paper. All authors have seen and approved the final draft of the paper.