Using Weights in Stata

Stata understands four types of weighting:

aweight   Analytical weights, used in weighted least squares (WLS) regression and similar procedures.

fweight   Frequency weights, indicating how many identical observations each record represents. Frequency weights must be integers.

iweight   Importance weights, however you define importance.

pweight   Probability or sampling weights, proportional to the inverse of the probability that an observation is included due to the sampling strategy.
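
In practice, a weight is attached to a command in square brackets, in the form [weighttype = varname]. As a minimal sketch, the lines below build a tiny made-up dataset (the variables y and w are hypothetical, not from the book) and run summarize with and without frequency weights.

```stata
* Toy data: y is a measurement, w is a hypothetical integer weight
clear
input y w
10 1
20 3
end

* Unweighted: each of the 2 rows counts once
summarize y
scalar m_unw = r(mean)

* Frequency-weighted: the second row stands for 3 identical observations
summarize y [fweight = w]
scalar m_fw = r(mean)
scalar n_fw = r(N)

display "unweighted mean = " m_unw
display "fweighted mean  = " m_fw "  (N = " n_fw ")"
```

With frequency weights, the reported number of observations is the sum of the weights rather than the number of rows, which is why fweight values must be integers.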

Not all types of weighting have been defined for all types of analyses. We cannot, for example, use pweight with the tabulate command. Using weights effectively requires a clear understanding of what we want them to accomplish in a particular analysis.
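
One way to see which weight types a command accepts is simply to try them. The sketch below uses made-up data (g and w are hypothetical variables) and capture to trap the error that tabulate raises when given a pweight.

```stata
* Toy data: g is a category, w is a hypothetical weight
clear
input g w
1 2
1 3
2 5
end

* tabulate accepts fweight (as well as aweight and iweight)
tabulate g [fweight = w]

* pweight is not defined for tabulate; capture traps the error
capture tabulate g [pweight = w]
scalar rc_pw = _rc
display "return code from tabulate with pweight: " rc_pw
```

A nonzero return code means the command refused the weight. For tables from sampling-weighted data, the survey commands covered in Chapter 4 are the usual route.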

Weights have many statistical applications, including compensation for disproportionate or complex sampling designs, a common feature of surveys. pweight provides one way to adjust for sampling bias, using probability weights proportional to 1/(probability of selection). Analysis of survey data using probability weights is a particular strength of Stata, introduced in Chapter 4.
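
The 1/(probability of selection) rule can be sketched with a two-row toy sample. Everything here is illustrative: psel is a hypothetical variable holding each observation's selection probability, and real survey files often supply a ready-made weight variable instead.

```stata
* Toy sample: psel is a hypothetical probability of selection
clear
input y psel
50 0.5
70 0.1
end

* Probability weight proportional to the inverse of the selection probability
generate pw = 1/psel

* mean accepts pweights (summarize does not)
mean y [pweight = pw]
matrix b = e(b)
display "pweighted mean = " b[1,1]
```

The second observation had only a 1-in-10 chance of being sampled, so it stands in for more of the population and pulls the weighted mean toward its value.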

In some instances, weighting involves something simpler — an aggregate dataset in which the variables are statistics summarizing many individual observations. For example, dataset Nations2.dta contains United Nations human-development indicators that characterize living conditions in 194 nations.

The mean life expectancy is 68.7 years.

The mean above represents the average life expectancy for the 194 nations in the sample, rather than the average life expectancy for the 7 billion people who live in those nations. That is, it weights the life expectancy of the smallest nation (Tuvalu, a Pacific island nation with about 10,000 people) the same as the life expectancy of the largest (China, population about 1.3 billion). Using population as a frequency weight, we get a better estimate of the mean life expectancy for all 7 billion people.
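
The difference between the two means can be sketched with two hypothetical nations. The population sizes echo the rough figures above, but the life expectancy values are made up for illustration only, and the variable names life and pop are assumptions about how Nations2.dta is organized.

```stata
* Two hypothetical nations; life values are invented for illustration
clear
input str8 nation life double pop
"Small" 60 10000
"Large" 75 1300000000
end

* Each nation counts once
summarize life
scalar m_nations = r(mean)

* Each nation counts once per resident
summarize life [fweight = pop]
scalar m_people = r(mean)

display "mean across nations = " m_nations
display "mean across people  = " m_people

* With the book's data, assuming variables named life and pop:
*   use Nations2.dta, clear
*   summarize life [fweight = pop]
```

The unweighted mean sits halfway between the two nations, while the population-weighted mean is almost entirely determined by the large one.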

Probability weights (pweight) get more attention in Chapter 4. Analytical weights (aweight) are useful in graphing (Chapter 3) and for weighted least squares (Chapters 7 and 8), among other things. Importance weights (iweight) have no fixed definition, but could be applied in programs written for special purposes.
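
As a rough sketch of analytical weights in weighted least squares, suppose each row of an aggregate dataset is a group mean based on a known number of cases. The data and the variable names x, ybar, and n below are hypothetical.

```stata
* Toy aggregate data: each row is a group mean (ybar) based on n cases
clear
input x ybar n
1 2.0 10
2 2.9 40
3 4.2 25
4 4.8 5
end

* Ordinary least squares: every group mean counts equally
regress ybar x
scalar b_ols = _b[x]

* Weighted least squares: means based on more cases get more weight
regress ybar x [aweight = n]
scalar b_wls = _b[x]

display "OLS slope = " b_ols "   WLS slope = " b_wls
```

The same [aweight = n] qualifier can be attached to scatter, which then scales marker size by the weight.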

Source: Hamilton, Lawrence C. (2012), Statistics with STATA: Version 12, 8th edition, Cengage Learning.
