Kaplan–Meier Survivor Functions Using Stata

Let n_t represent the number of observations that have not failed, and are not censored, at the beginning of time period t, and let d_t represent the number of failures that occur among these observations during time period t. The Kaplan–Meier estimator of the probability of surviving beyond time t is the product of the survival probabilities in t and the preceding periods:

$$S(t) = \prod_{j = t_0}^{t} \frac{n_j - d_j}{n_j}$$

For example, in the AIDS data seen earlier, one of the 51 individuals developed symptoms only one month after diagnosis. No observations were censored this early, so the probability of “surviving” (meaning, not developing AIDS) beyond time = 1 is

S(1) = (51 – 1) / 51 = .9804

A second patient developed symptoms at time = 2, and a third at time = 9:

S(2) = .9804 × (50 – 1) / 50 = .9608

S(9) = .9608 × (49 – 1) / 49 = .9412
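The same arithmetic can be checked with Stata's display command. This is only a sketch of the calculation, not part of the original example; it assumes, as the risk-set sizes 51, 50, and 49 imply, that no observations were censored before time = 9.

```stata
* S(1): one failure among 51 at risk
display %6.4f (51 - 1)/51

* S(2): multiply by the survival probability at time 2 (50 at risk)
display %6.4f (51 - 1)/51 * (50 - 1)/50

* S(9): multiply by the survival probability at time 9 (49 at risk)
display %6.4f (51 - 1)/51 * (50 - 1)/50 * (49 - 1)/49
```

Because each numerator cancels the next denominator, the three values reduce to 50/51, 49/51, and 48/51.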

Graphing S(t) against t produces a Kaplan–Meier survivor curve, like the one seen in Figure 10.1. Stata draws such graphs automatically with the sts graph command.
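A minimal sketch of that step, assuming the AIDS data are already in memory and have been declared as survival-time data with stset (the dataset and variable names are not given here, so none are shown):

```stata
* List the Kaplan-Meier survivor function: time, number at risk, failures, S(t)
sts list

* Plot the same estimates as a step function
sts graph
```

Under those assumptions, the first rows of the sts list output should agree with the hand calculations for S(1), S(2), and S(9).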

For a second example of survivor functions, we turn to data in smoking1.dta, adapted from Rosner (1995). The observations are 234 former smokers attempting to quit. Most did not succeed. Variable days records how many days elapsed between quitting and starting up again. The study lasted one year, and variable smoking indicates whether an individual resumed smoking before the end of the study (smoking = 1, “failure”) or not (smoking = 0, “censored”). With new data, we should begin by using stset to set the data up for survival-time analysis.

. use C:\data\smoking1.dta, clear

. describe
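The stset command itself does not appear above. Given the variable descriptions, a plausible form (an assumption, not confirmed by the text) declares days as the time variable and smoking = 1 as the failure event:

```stata
* Assumed setup: days = time until relapse or censoring, smoking = 1 marks a failure
stset days, failure(smoking)

* Optional check of the declared survival-time structure
stdescribe
```

Once the data are stset, the st commands that follow (stsum, sts graph, sts test) pick up the time and failure variables automatically.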

The study involved 110 men and 124 women. Incidence rates for both sexes appear to be similar:

. stsum, by(sex)

Figure 10.2 confirms this similarity. There appears to be little difference between the survivor functions of men and women. That is, both sexes returned to smoking at about the same rate. The survival probabilities of nonsmokers decline very steeply during the first 30 days after quitting. For either sex, there is less than a 15% chance of surviving as a nonsmoker beyond a full year.

. sts graph, by(sex) plot1opt(lwidth(medium)) plot2opt(lwidth(thick))

We can also formally test for the equality of survivor functions using a log-rank test. Unsurprisingly, this test finds no significant difference (p = .6772) between the smoking recidivism of men and women.
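The command behind this test is not shown. Assuming the data are stset as described, a log-rank test of equality across sex would typically be requested with sts test, whose default is the log-rank test:

```stata
* Log-rank test for equality of survivor functions between men and women
sts test sex, logrank
```

The output reports observed and expected failure counts for each group along with a chi-squared statistic and its p-value.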

Source: Hamilton, Lawrence C. (2012). Statistics with Stata: Version 12 (8th ed.). Cengage Learning.
