Fixed-Effects Regression in Panel Data Analysis using Stata

In panel data analysis, we use a fixed-effects model when we are only interested in analyzing the impact of variables that vary over time. This model is “designed to study the causes of changes within an entity. A time-invariant characteristic cannot cause such a change, because it is constant for each entity” (Kohler and Kreuter, 2008).

A fixed-effects model explores the relationship between the independent variables and the dependent variable within an entity (a province, in our empirical study). Each entity has its own individual characteristics, which may or may not influence the dependent variable.

When using a fixed-effects model, we assume that something within the individual may impact or bias the independent variables, and we need to control for this. This is the rationale behind the assumption that the entity’s error term is correlated with the independent variables. The fixed-effects model removes the effect of those time-invariant characteristics so that we can assess the net effect of the independent variables on the dependent variable.

Another important assumption of the fixed-effects model is that those time-invariant characteristics are unique to the individual and should not be correlated with the characteristics of other individuals. Each entity is different; therefore the entity’s error term and the constant (which captures individual characteristics) should not be correlated with those of the other entities. If the error terms are correlated, the fixed-effects model is not suitable because inferences may not be correct, and we need to consider the random-effects model. This is the main rationale for the Hausman test.
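In Stata, the Hausman test compares the coefficient estimates from the fixed-effects and random-effects models; its null hypothesis is that the difference between them is not systematic. The following is a minimal sketch on simulated data, not on the province data from the video. The variable names (id, year, alpha, x, y), the seed, and the stored-estimate names (fixed, random) are all assumptions made for illustration.

```stata
* Simulated panel: 62 entities, 7 years each (all names hypothetical)
clear
set seed 2024
set obs 62
generate id = _n
* Entity-specific effect, constant over time
generate alpha = rnormal()
expand 7
bysort id: generate year = 2009 + _n
* x is built to be correlated with the entity effect
generate x = 0.8*alpha + rnormal()
generate y = alpha + 2*x + rnormal()
xtset id year

* Fit both models and store the results
xtreg y x, fe
estimates store fixed
xtreg y x, re
estimates store random

* Consistent estimator first, efficient estimator second
hausman fixed random
```

Because x is constructed to be correlated with the entity effect in this simulation, we would expect the test to reject the null here; on real data the outcome depends on the sample.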

The equation for the fixed effects model becomes:

Yit = αi + β1Xit + uit

Where:

• αi (i = 1, …, n) is the unknown intercept for each entity (n entity-specific intercepts),
• Yit is the dependent variable (DV), where i = entity and t = time,
• Xit represents one independent variable (IV),
• β1 is the coefficient for that IV,
• uit is the error term.
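To make the equation concrete, here is a minimal Stata sketch on simulated data, not on the province data used in the video. The variable names (id, year, alpha, x, y), the seed, and the true coefficient of 2 are assumptions made for illustration. Because x is built to be correlated with the entity intercept alpha, pooled OLS should tend to overstate the slope, while xtreg with the fe option should give an estimate close to 2.

```stata
* Simulated panel: 62 entities, 7 years each (all names hypothetical)
clear
set seed 12345
set obs 62
generate id = _n
* alpha plays the role of the entity-specific intercept
generate alpha = rnormal()
* Create 7 yearly rows per entity: 2010, ..., 2016
expand 7
bysort id: generate year = 2009 + _n
* Regressor correlated with the entity intercept
generate x = 0.8*alpha + rnormal()
* Outcome generated with a true slope of 2
generate y = alpha + 2*x + rnormal()

* Declare the panel structure
xtset id year

* Pooled OLS ignores the entity intercepts
regress y x

* Within (fixed-effects) estimator
xtreg y x, fe
```

The coefficient on x in the xtreg output corresponds to β1 in the equation above; the entity intercepts themselves are absorbed rather than reported one by one.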

“The key insight is that if the unobserved variable does not change over time, then any changes in the dependent variable must be due to influences other than these fixed characteristics” (Stock and Watson, 2003, pp. 289-290). In the case of time-series cross-sectional data, the interpretation of the beta coefficients would be “…for a given country, as X varies across time by one unit, Y increases or decreases by β units” (Bartels, Brandon, “Beyond ‘Fixed Versus Random Effects’: A framework for improving substantive and statistical analysis of panel, time-series cross-sectional, and multilevel data”, Stony Brook University, working paper, 2008). Fixed-effects models will not work well with data in which within-cluster variation is minimal, or with variables that change slowly over time.
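One way to check how much within-entity variation a variable has is the xtsum command, which splits the standard deviation into a between and a within component. The sketch below uses simulated data; the variable names slow and fast, and the amount of noise added, are assumptions chosen only to contrast a slow-changing variable with a freely varying one.

```stata
* Simulated panel: 62 entities, 7 years each (all names hypothetical)
clear
set seed 7
set obs 62
generate id = _n
* Start from an entity-level value that is copied to every year
generate slow = rnormal()
expand 7
bysort id: generate year = 2009 + _n
* Add only a small year-to-year movement: a slow-changing variable
replace slow = slow + 0.05*rnormal()
* A variable that varies freely over time
generate fast = rnormal()
xtset id year

* Compare the "between" and "within" standard deviations
xtsum slow fast
```

A within standard deviation that is small relative to the between one, as for slow in this setup, suggests that a fixed-effects estimate for that variable would rest on very little information.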

Another way to see the fixed effects model is by using binary variables. So the equation for the fixed effects model becomes:
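One common way to write this binary-variable form, keeping the single independent variable used above, is sketched here. The symbols β0, γj and Ej are assumptions of this sketch: Ej is a binary variable equal to 1 when the observation belongs to entity j and 0 otherwise, and entity 1 serves as the omitted baseline.

```latex
Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 E_2 + \dots + \gamma_n E_n + u_{it}
```

Under this notation, the entity-specific intercepts of the earlier equation are α1 = β0 for the baseline entity and αj = β0 + γj for j = 2, …, n.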

You could add time effects to the entity effects model to have a time and entity fixed effects regression model:
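A sketch of such a model, written with binary variables and a single independent variable, is given here. The notation is an assumption of this sketch: Ej equals 1 for observations of entity j, Ts equals 1 for observations in period s, m is the number of periods, and the first entity and the first period are the omitted baselines.

```latex
Y_{it} = \beta_0 + \beta_1 X_{it}
       + \gamma_2 E_2 + \dots + \gamma_n E_n
       + \delta_2 T_2 + \dots + \delta_m T_m + u_{it}
```

Each δs then measures how far the outcome in period s sits from the baseline period, for all entities alike.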

Control for time effects whenever unexpected variation or special events may affect the outcome variable.
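In Stata, time effects can be added as year dummies with the i. prefix, and testparm gives a joint test of whether they are needed. The sketch below runs on simulated data in which a hypothetical event shifts every entity's outcome in 2013; the variable names, the seed, and the size of the shock are assumptions. It also fits the same model with explicit entity and year dummies to show that the slope on x should match the within estimate.

```stata
* Simulated panel with a common shock in one year (all names hypothetical)
clear
set seed 99
set obs 62
generate id = _n
generate alpha = rnormal()
expand 7
bysort id: generate year = 2009 + _n
* Assumed special event: every entity's outcome shifts by 1.5 in 2013
generate shock = cond(year == 2013, 1.5, 0)
generate x = 0.8*alpha + rnormal()
generate y = alpha + 2*x + shock + rnormal()
xtset id year

* Entity and time fixed effects: year dummies added to the within estimator
xtreg y x i.year, fe
scalar b_within = _b[x]

* Joint test that all year dummies are zero
testparm i.year
scalar p_time = r(p)

* Same model with explicit entity and year dummies
regress y x i.id i.year
scalar b_lsdv = _b[x]
display "within: " b_within "   dummies: " b_lsdv
```

If the testparm p-value is small, the year dummies are jointly significant and time effects are worth keeping in the model.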

A note on fixed effects: “…The fixed-effects model controls for all time-invariant differences between the individuals, so the estimated coefficients of the fixed-effects models cannot be biased because of omitted time-invariant characteristics…[like culture, religion, gender, race, etc]. One side effect of the features of fixed-effects models is that they cannot be used to investigate time-invariant causes of the dependent variables. Technically, time-invariant characteristics of the individuals are perfectly collinear with the person [or entity] dummies. Substantively, fixed-effects models are designed to study the causes of changes within a person [or entity]. A time-invariant characteristic cannot cause such a change, because it is constant for each person.” (Kohler, Ulrich, Frauke Kreuter, Data Analysis Using Stata, 2nd ed., p. 245).
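The collinearity described in the quotation can be seen directly in Stata. In the sketch below, which uses simulated data, coastal is a hypothetical time-invariant dummy; it and the other variable names are assumptions for illustration. When coastal is included in a fixed-effects regression, Stata should note that it is omitted because of collinearity and report no estimate for it.

```stata
* Simulated panel: 62 entities, 7 years each (all names hypothetical)
clear
set seed 321
set obs 62
generate id = _n
generate alpha = rnormal()
* Hypothetical characteristic that never changes within an entity
generate coastal = mod(id, 2)
expand 7
bysort id: generate year = 2009 + _n
generate x = 0.8*alpha + rnormal()
* coastal does affect y, but only as a constant shift per entity
generate y = alpha + 2*x + coastal + rnormal()
xtset id year

* coastal cannot be separated from the entity effects and is dropped
xtreg y x coastal, fe
```

The coefficient on x is still estimated; only the effect of the time-invariant variable is unavailable from this model.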

The panel data used in the video, which you can download here as a Stata dataset or an Excel file, comprise 434 province-year observations for 62 provinces, the entities in our sample; each province has 7 yearly observations. These data were collected from the statistical yearbooks of Vietnam’s provinces for the period from 2010 to 2016, then cleaned by removing provinces and year-observations with missing data.
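The structure described above (62 provinces × 7 years = 434 observations) can be declared and checked in Stata with xtset and xtdescribe. The sketch below builds an empty skeleton with the same dimensions rather than loading the actual file; the variable names province and year are assumptions, and the real dataset may use different ones.

```stata
* Skeleton with the same dimensions as the panel described above
* (variable names are hypothetical; no real data are loaded)
clear
set obs 62
generate province = _n
expand 7
bysort province: generate year = 2009 + _n

* Declare province as the panel variable and year as the time variable
xtset province year

* Summarize the pattern of observations per province
xtdescribe

* Total number of province-year rows
count
```

With every province observed in each year from 2010 to 2016, xtset should report the panel as strongly balanced.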