Figure 6 | BMC Geriatrics

Figure 6

From: Indicators of "Healthy Aging" in older women (65-69 years of age). A data-mining approach based on prediction of long-term survival


Forward variable selection. The best bivariate model was based on the number of step-ups a subject completed in 10 seconds and whether the subject had previously smoked (average C = 0.614). Variables were added to this model iteratively to evaluate the concordance values associated with larger models. At each iteration, given a baseline model with p variables, the concordance of every possible model containing p + 1 variables was evaluated (based on 20 cross-validation simulations). The best model containing p + 1 variables was then chosen as the new baseline model, and the process was repeated. (A) Points show the mean concordance index of each model created by this process; upper and lower lines indicate 95% confidence limits. (B) The size of a tentative model was chosen as the value of p that minimized a loss function. The point of diminishing returns with increasing p corresponds to a "knee", or leveling-off point, of the curve shown in part (A). To locate this point quantitatively, the scales shown in part (A) were mapped to the interval [0,1], and the loss function was defined as the distance between the plotted curve and the extreme upper-left corner of the coordinate system. The value of p that minimized this loss function (p = 13 variables) is denoted by the dashed vertical lines in parts (A) and (B).
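
The procedure described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `forward_select` and `knee_size`, the variable names, and the toy scoring function are all hypothetical. In the study, the score would be the mean cross-validated concordance index of a survival model; here a made-up additive score stands in for it so that the example runs on its own. The loss is taken to be the Euclidean distance to the corner (0, 1) after min-max scaling of both axes, which is one reasonable reading of the description above.

```python
import math
from typing import Callable, Iterable, List, Sequence, Tuple

Model = Tuple[str, ...]


def forward_select(
    baseline: Sequence[str],
    candidates: Iterable[str],
    score: Callable[[Model], float],
) -> List[Tuple[Model, float]]:
    """Greedy forward selection.

    Starting from `baseline`, repeatedly add the single candidate variable
    that yields the highest score, and record each (model, score) step.
    """
    model: Model = tuple(baseline)
    remaining = [v for v in candidates if v not in model]
    path = [(model, score(model))]
    while remaining:
        # Evaluate every model with one additional variable; keep the best.
        best_score, best_var = max((score(model + (v,)), v) for v in remaining)
        model = model + (best_var,)
        remaining.remove(best_var)
        path.append((model, best_score))
    return path


def knee_size(sizes: Sequence[int], scores: Sequence[float]) -> int:
    """Return the model size closest to the upper-left corner (0, 1)
    after both axes are min-max scaled to [0, 1]."""
    lo_x, hi_x = min(sizes), max(sizes)
    lo_y, hi_y = min(scores), max(scores)
    if hi_x == lo_x or hi_y == lo_y:
        raise ValueError("need at least two distinct sizes and scores")

    def loss(p: int, c: float) -> float:
        x = (p - lo_x) / (hi_x - lo_x)
        y = (c - lo_y) / (hi_y - lo_y)
        return math.hypot(x, 1.0 - y)

    return min(zip(sizes, scores), key=lambda pc: loss(*pc))[0]


# Toy stand-in for a cross-validated concordance index (values are invented).
TOY_GAINS = {"v1": 0.040, "v2": 0.020, "v3": 0.004, "v4": 0.001}


def toy_score(model: Model) -> float:
    return 0.60 + sum(TOY_GAINS.get(v, 0.0) for v in model)


path = forward_select(["stepups", "smoked"], TOY_GAINS, toy_score)
sizes = [len(m) for m, _ in path]
scores = [s for _, s in path]
print(knee_size(sizes, scores))  # prints 3 for this toy data
```

With real data, `toy_score` would be replaced by a function that fits the survival model on the given variables and returns its average concordance over repeated cross-validation runs.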
