Logistic Regression with Ordered-Category y Using Stata

logit and logistic fit models for variables with two outcomes, coded 0 and 1. We need other methods for models in which y takes on more than two values. Two important possibilities are ordered and multinomial logistic regression.

ologit   Ordered logistic regression, where y is an ordinal (ordered-category) variable. The numeric values representing the categories do not matter, except that higher numbers mean “more.” For example, the y categories might be {1 = “poor,” 2 = “fair,” 3 = “excellent”}.

mlogit   Multinomial logistic regression, where y has multiple but unordered categories such as {1 = “Democrat,” 2 = “Republican,” 3 = “other”}.

If y is {0,1}, logit, ologit and mlogit all produce essentially the same estimates.
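As a rough check of this equivalence, the three commands can be run on the same {0,1} variable. The sketch below assumes the shuttle data are in memory and uses any, the dichotomized version of distress from the earlier binary analyses. The baseoutcome(0) option is added here so that mlogit reports coefficients for outcome 1 relative to outcome 0. ologit reports a cut point in place of a constant; it should equal the logit constant with its sign reversed.

. * any is the {0,1} indicator built earlier from distress
. logit any date temp
. ologit any date temp
. * baseoutcome(0) makes outcome 0 the comparison category
. mlogit any date temp, baseoutcome(0)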

We earlier simplified the three-outcome ordinal variable distress into a dichotomy, any. logit and logistic require {0,1} dependent variables. ologit, on the other hand, is designed for ordinal variables that have more than two values. Recall that distress has outcomes 0 = “none,” 1 = “1 or 2,” and 2 = “3 plus” incidents of booster-joint distress.

Ordered logistic regression indicates that date and temp both affect distress, with the same signs (positive for date, negative for temp) seen in our earlier binary logit analyses:
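The estimation output is not reproduced here. Judging from the predictors named above and the command used later in this section, the model would be fit with something like the following, assuming the shuttle data are in memory:

. * full model: distress predicted by date and temp
. ologit distress date temp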

Likelihood-ratio tests are more accurate than the asymptotic z tests shown. First, use estimates store to preserve in memory the results from the full model (with two predictors) just estimated. We can give this model any descriptive name, such as date_temp.

. estimates store date_temp

Next, fit a simpler model without temp, store its results with the name notemp, and ask for a likelihood-ratio test of whether the fit of the reduced model notemp differs significantly from that of the full model date_temp:
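The commands for these three steps are not reproduced here. A minimal sketch that follows the description, using the stored names notemp and date_temp from the text:

. * reduced model without temp, fit quietly to suppress output
. quietly ologit distress date
. estimates store notemp
. * likelihood-ratio test of the reduced model against the full model
. lrtest notemp date_temp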

The lrtest output notes its assumption that model notemp is nested in model date_temp, meaning that the parameters estimated in notemp are a subset of those in date_temp, and that both models are estimated from the same pool of observations (which can be tricky when the data contain missing values). This likelihood-ratio test indicates that notemp’s fit is significantly poorer. Because the presence of temp as a predictor in model date_temp is the only difference, the likelihood-ratio test thus informs us that temp’s contribution is significant. Similar steps confirm that date also has a significant effect.
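The "similar steps" for date might look like the sketch below. The stored name nodate is a hypothetical label chosen for illustration, and the full model is assumed to be stored already as date_temp.

. * reduced model without date (nodate is a hypothetical name)
. quietly ologit distress temp
. estimates store nodate
. lrtest nodate date_temp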

The estimates store and lrtest commands provide flexible tools for comparing nested maximum-likelihood models. Type help lrtest and help estimates for details and options.

The ordered-logit model estimates a score, S, for each observation as a linear function of date and temp:

S = .003286date – .1733752temp
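One way to obtain this score for every observation is predict with the xb option, which returns the linear prediction after ologit. The sketch below also rebuilds the score by hand from the stored coefficients as a check. The variable names S and S_check are hypothetical, and the full model is refit first so that its estimates are the active ones.

. quietly ologit distress date temp
. * linear prediction (the score) from the fitted model
. predict S, xb
. * the same score computed directly from the coefficients
. generate S_check = _b[date]*date + _b[temp]*temp
. summarize S S_check

The two variables should agree up to rounding wherever date and temp are both nonmissing.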

Predicted probabilities depend on the value of S, plus a logistically distributed disturbance u, relative to the estimated cut points (shown in ologit output as cut1, cut2 etc.).
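For the three-category variable distress, this relationship can be written out explicitly. The expressions below follow from the logistic distribution of u, with cut1 and cut2 standing for the two estimated cut points:

P(distress = 0) = P(S + u <= cut1) = 1/(1 + exp(S - cut1))

P(distress = 1) = P(cut1 < S + u <= cut2) = 1/(1 + exp(S - cut2)) - 1/(1 + exp(S - cut1))

P(distress = 2) = P(cut2 < S + u) = 1 - 1/(1 + exp(S - cut2))

The three probabilities sum to 1, and a larger score S shifts probability toward the higher categories.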

After ologit, predict calculates predicted probabilities for each category of the dependent variable. We supply predict with names for these probabilities. For example, none could denote the probability of no distress incidents (first category of distress), onetwo the probability of 1 or 2 incidents (second category of distress), and threeplus the probability of 3 or more incidents (third and last category of distress):

. quietly ologit distress date temp

. predict none onetwo threeplus

This creates three new variables:
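The listing of these variables is not reproduced here. One way to inspect them, as a sketch, is to describe and summarize them and to confirm that the three probabilities sum to 1 for each observation. The variable name ptotal is hypothetical.

. describe none onetwo threeplus
. * the three predicted probabilities should sum to 1
. generate ptotal = none + onetwo + threeplus
. summarize none onetwo threeplus ptotal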

Our model, based on the analysis of 23 pre-Challenger shuttle flights, predicts little chance (p = .000075) of Challenger experiencing no booster joint damage, a scarcely greater chance of one or two incidents (p = .0003), but virtual certainty (p = .9996) of three or more damage incidents.

See Long (1997) or Hosmer and Lemeshow (2000) for detailed presentations of this and related techniques. The Base Reference Manual explains Stata’s implementation. Long and Freese (2006) provide additional Stata-focused discussion, and make available their ado-files for some useful interpretation and postestimation commands, such as Brant tests. To install these unofficial, free ado-files from the Web, type findit brant and follow the link under Web resources.
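Once the ado-files are installed, a Brant test of the parallel-regression assumption could be requested right after ologit. This is a sketch that assumes the user-written brant command is available. With only 23 observations the test may be unreliable, or may not run at all.

. quietly ologit distress date temp
. * brant is a user-written command from the Long and Freese ado-files
. brant, detail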

Source: Hamilton, Lawrence C. (2012), Statistics with Stata: Version 12, 8th edition, Cengage Learning.
