Generalized Linear Models by using Stata

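Generalized linear models (GLM) have the form

$$g[E(y)] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k, \qquad y \sim F$$

where g[ ] is the link function and F the distribution family. This general formulation encompasses many specific models. For example, if g[ ] is the identity function and y follows a normal (Gaussian) distribution, we have a linear regression model:

$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k, \qquad y \sim \text{Normal}$$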

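If g[ ] is the logit function and y follows a Bernoulli distribution, we have logit regression instead:

$$\text{logit}[E(y)] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k, \qquad y \sim \text{Bernoulli}$$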

Because of its broad applications, GLM could have been introduced at several different points in this book. Its relevance to this chapter comes from the ability to fit event models. Poisson regression, for example, requires that g[ ] be the natural log function and that y follow a Poisson distribution:

$$\ln[E(y)] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k, \qquad y \sim \text{Poisson}$$
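
As a worked illustration of how the link function connects the predictors to the expected value, each of the three links named above can be inverted. The symbol \eta for the linear predictor is notation introduced here for convenience, not taken from the source:

```latex
\begin{aligned}
\eta &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \\[4pt]
\text{identity link:}\quad E(y) &= \eta \\[4pt]
\text{logit link:}\quad \ln\!\frac{E(y)}{1 - E(y)} = \eta
  \;&\Longrightarrow\; E(y) = \frac{1}{1 + e^{-\eta}} \\[4pt]
\text{log link:}\quad \ln E(y) = \eta
  \;&\Longrightarrow\; E(y) = e^{\eta}
\end{aligned}
```

The inverse logit keeps E(y) between 0 and 1, as a Bernoulli mean must be, and the inverse log keeps E(y) positive, as a Poisson mean must be.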

As might be expected with such a flexible method, Stata’s glm command permits many different options. Users can specify not only the distribution family and link function, but also details of the variance estimation, fitting procedure, output, and exposure. These options make glm a useful alternative even when applied to models for which a dedicated command (such as regress, logistic, or poisson) already exists.

We might represent a generic glm command as follows:

. glm y x1 x2 x3, family(familyname) link(linkname) exposure(expvar) eform vce(jackknife)
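
A minimal sketch of how the generic command might be filled in for the three models described earlier. It assumes Stata's bundled auto dataset as a stand-in (not the book's data), and treating rep78 as a count is purely illustrative:

```stata
* Stand-in data: Stata's bundled auto dataset (an assumption, not the book's data)
sysuse auto, clear

* Linear regression: identity link, Gaussian family
glm price mpg weight, family(gaussian) link(identity)

* Logit regression: logit link, binomial family (Bernoulli for a 0/1 outcome)
glm foreign mpg weight, family(binomial) link(logit)

* Poisson regression: log link, Poisson family
* (rep78 is a 1-5 repair record, used as a count here only for illustration)
glm rep78 mpg weight, family(poisson) link(log)
```

Only the family() and link() options change between the three fits; the variable list and the rest of the syntax stay the same.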

Chapter 7 began with the simple regression of life expectancy on the mean years of schooling in 188 nations:

. regress life school

We can fit the same model and obtain the same estimates via a glm command:

. glm life school, link(identity) family(gaussian)

Because link(identity) and family(gaussian) are default options, we could have left them out of the previous glm command.
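
The equivalence can be checked directly. The sketch below assumes the bundled auto dataset and its variables price and mpg as stand-ins; with the chapter's data loaded, life and school would take their place:

```stata
* Stand-in data: Stata's bundled auto dataset (an assumption, not the book's data)
sysuse auto, clear

* OLS via the dedicated command; keep the slope for comparison
regress price mpg
scalar b_ols = _b[mpg]

* Same model via glm with the link and family spelled out
glm price mpg, link(identity) family(gaussian)
scalar b_glm = _b[mpg]

* Same model again, relying on the defaults
glm price mpg

* The three slopes should agree
display "OLS: " b_ols "   glm: " b_glm "   glm (defaults): " _b[mpg]
```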

We could also fit the same OLS model but obtain bootstrap standard errors by adding the vce(bootstrap) option.

The bootstrap standard errors reflect observed variation among coefficients estimated from 50 samples of n = 188 cases each, drawn by random sampling with replacement from the original n = 188 dataset. In this example, the bootstrap standard errors are less than the corresponding theoretical standard errors, and the resulting confidence intervals are narrower.
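
A hedged sketch of such a bootstrap fit, again assuming the bundled auto dataset as a stand-in. The reps(50) suboption matches the 50 resamples described above, and the seed value is an arbitrary choice made here so that the resampling can be reproduced:

```stata
* Stand-in data: Stata's bundled auto dataset (an assumption, not the book's data)
sysuse auto, clear

* OLS-equivalent glm with bootstrap standard errors:
* 50 resamples drawn with replacement; seed(12345) is an arbitrary value
glm price mpg, link(identity) family(gaussian) vce(bootstrap, reps(50) seed(12345))
```

Because the resamples are random, a different seed gives slightly different standard errors, and with only 50 replications that variation can be noticeable.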

Similarly, we could use glm to repeat the logistic regression with the space shuttle data that began Chapter 9. For this example we ask for jackknife standard errors and odds ratio or exponential-form (eform) coefficients:

. use C:\data\shuttle0.dta, clear
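
A minimal sketch of a logit fit with these two options. It assumes the bundled auto dataset and its 0/1 variable foreign as stand-ins, since the shuttle variable names are not given here:

```stata
* Stand-in data: Stata's bundled auto dataset (an assumption, not the shuttle data)
sysuse auto, clear

* Logit model via glm:
*   eform          reports exponentiated coefficients (odds ratios with a logit link)
*   vce(jackknife) refits the model leaving out one observation at a time
glm foreign mpg weight, link(logit) family(binomial) eform vce(jackknife)
```

With eform, a reported value above 1 means the odds of the outcome rise with the predictor, and a value below 1 means they fall.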

Although glm can replicate the models fit by many specialized commands, and adds some new capabilities, the specialized commands have their own advantages including speed and customized options. A particular attraction of glm is its ability to fit models for which Stata has no specialized command.

Source: Hamilton, Lawrence C. (2012), Statistics with STATA: Version 12, 8th edition, Cengage Learning.
