Using Frequency Weights with Stata

summarize, tabulate, table and related commands can be used with frequency weights, which indicate how many identical observations each row of data represents. For example, summarize reports the mean and other statistics for per capita electricity use across all U.S. states.
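To make the idea of replicated observations concrete, here is a minimal sketch on a made-up three-row dataset. The variables x and n and all values are invented for illustration and do not come from the book's data. Weighting by n should give the same results as physically copying each row n times with expand.

```stata
* Toy data: x is a value, n is how many identical observations it stands for
* (variable names and values are invented for illustration)
clear
input x n
10 2
20 3
60 1
end

* Weighted summary: treats the data as 2 + 3 + 1 = 6 observations
summarize x [fweight = n]

* The long way: replicate each row n times, then summarize without weights
expand n
summarize x
```

Both summarize commands should report 6 observations and a mean of about 23.3, that is (2*10 + 3*20 + 1*60) / 6. Frequency weights must be whole numbers, since they count observations.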

This mean, 13,318 kWh, tells us the average per capita electricity use across the 51 states (the 50 states plus the District of Columbia), counting each state as one unit. Wyoming has the smallest population (564 thousand) and the highest per capita electricity consumption (27,457 kWh). California has the largest population (37 million) and the lowest per capita electricity consumption (6,721 kWh). Each has equal weight in this 51-state mean. To see the per capita mean for the U.S. as a whole, however, we need to weight by population.
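The weighted command follows the usual Stata weight syntax. As a rough sketch, the example below enters only the two states quoted above, with populations rounded as in the text. The variable names state, pop and elec are assumptions and are not necessarily those used in the book's dataset.

```stata
* Two-state illustration using the rounded figures quoted in the text
* (variable names are assumed; the full dataset has 51 rows)
clear
input str10 state long pop elec
"Wyoming"      564000 27457
"California" 37000000  6721
end

* Each state counts once: (27457 + 6721) / 2 = 17089
summarize elec

* Each state counts pop times, so California dominates the mean
summarize elec [fweight = pop]
```

With just these two rows, the weighted mean falls to roughly 7,032 kWh, far closer to California's value than the unweighted 17,089 kWh. The reported minimum and maximum are still the two state values, 6,721 and 27,457.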

The population-weighted mean (12,114 kWh) is lower than the mean of the 51 states (13,318 kWh) because more people live in relatively low-consuming states such as California and New York than in high-consuming states such as Wyoming and Kentucky (Figure 5.3).

The population-weighted mean, unlike the unweighted mean, can be interpreted as a mean for the 309 million people in the U.S. Note, however, that we could not make similar statements for the weighted standard deviation, minimum or maximum. Apart from the mean, most individual-level statistics cannot be calculated simply by weighting data that are already aggregated. Thus, we need to use weights with caution. They might make sense in the context of one particular analysis, but seldom do for the dataset as a whole, when many different kinds of analyses are needed.
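The reason the mean survives aggregation while the standard deviation does not can be written out briefly. The notation in this sketch is introduced here for illustration only: $y_{ij}$ is the consumption of person $i$ in state $j$, $n_j$ is the state's population, $\bar{y}_j$ is its per capita mean, and $N = \sum_j n_j$.

```latex
\bar{y} \;=\; \frac{1}{N}\sum_j \sum_{i=1}^{n_j} y_{ij}
        \;=\; \frac{\sum_j n_j\,\bar{y}_j}{\sum_j n_j},
\qquad
\frac{1}{N}\sum_j \sum_{i=1}^{n_j} \left(y_{ij}-\bar{y}\right)^2
 \;=\; \underbrace{\frac{1}{N}\sum_j n_j\left(\bar{y}_j-\bar{y}\right)^2}_{\text{between states}}
 \;+\; \underbrace{\frac{1}{N}\sum_j \sum_{i=1}^{n_j} \left(y_{ij}-\bar{y}_j\right)^2}_{\text{within states}}
```

The individual-level mean depends only on the state means and populations, so weighting recovers it exactly. The individual-level variance also includes a within-state term, which cannot be recovered from state averages. A weighted standard deviation computed from aggregated data reflects only the between-state part.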

Frequency weights operate in a similar way with both tabulate and table. The following table command calculates population-weighted means for each census division. Once the Pacific division's much larger population is taken into account, we see that it has lower mean electricity consumption than New England.
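A hedged sketch of such a table command is shown below, in the pre-Stata 17 syntax that the row option belongs to. The four rows are invented values chosen for easy arithmetic, and division, pop and elec are assumed variable names rather than the book's.

```stata
* Invented data: two divisions, two "states" each (not the book's figures)
clear
input division long pop elec
1 1000 10000
1 3000 14000
2 2000  8000
2 2000 12000
end

* Population-weighted mean of elec by division, plus a Total row
* (Stata 16 and earlier syntax, matching the version the text describes)
table division [fweight = pop], contents(mean elec) row

* In Stata 17 and later the table command was redesigned; a rough
* equivalent there would be:
*   table division [fweight = pop], statistic(mean elec)
```

With these numbers the division means should be 13,000 and 10,000, and the Total row 11,500, which is the same value summarize elec [fweight = pop] would give for the whole dataset.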

The row option called for a final row summarizing the table as a whole. The overall mean in this table (12,112.7 kWh) is the same as that found earlier by summarize.

Source: Hamilton, Lawrence C. (2012), Statistics with STATA: Version 12, 8th edition, Cengage Learning.
