This article introduces the practical process of choosing Fixed-Effects, Random-Effects or Pooled OLS Models in Panel data analysis. We will show you how to perform step by step on our panel data, from which we published the results in our article on Sustainability review in 2019 (see Nguyen Hoang Viet, Phan Thanh Tu and Lobo Antonio, 2019). You can see the theoretical difference of regression models with Panel data (fixed-effects, random-effects, and pooled OLS) in the previous article.
Our panel data used in this article, that you can download here in Stata datasheet or Excel data, includes 434 year-observations of 62 provinces as entities of our sample; each province has 7 year-observations. These data were collected from the statistical yearbooks of Vietnam’s provinces during the period from 2010 to 2016; then cleaned by eliminating some missing-data provinces and year-observations.
In this sample, “id” represents the entities as Vietnam provinces that we code them in number; and “year” represents the time variable (t). Note that you should make attention for assuring that all data of one thing such as one entity are coded exactly the same. If not, Stata will count as another thing or ignore it.
The objective of our research aims to study the relationship between foreign direct investment (FDI) and sustainability at provincial level in a developing host country as Vietnam for the period between the years of 2010 and 2016.
Here are the variables of our research; in which Dependent variable is the adjusted net savings that assess the sustainable development of Vietnam provinces. In total, we have 11 independent variables that are distinguished in three groups, including: 3 variables associated with the FDI inflow stocks; 5 variables associated with the employment in FDI sector, and 3 variables associated with the performance of FDI in provinces. And, 2 control variables are size and economic growth of province.
Practical regression process
Now, we apply the process of selecting the regression model for panel data (between Pooled OLS Model, Random-Effects Model and Fixed-Effects Model) of Dougherty (2011) for our panel data of Vietnam provinces in period from 2010 to 2016.
Source: Dougherty (2011, p.421)
For the first step, our sample can be considered as random sample because of our choice in time-span and FDI sector at provincial level in Vietnam.
So, we go into the second step of the Process of choosing regression model for panel data, in which we perform both fixed effects and random effects regressions by using Stata.
The Stata command to run fixed/random effects is xtreg.
Before using xtreg you need to set Stata to handle panel data by using the command xtset. Type: xtset Id Year, yearly. Note that Stata distinguishes capital letters, so you must type exactly the variable name. Or you can click this command on the Stata’s Menu by avoiding typing errors.
In this case, “Id” represents the entities that is Vietnam provinces; and “Year” represents the time variable t.
As the panel data has been handled, we can now run the fixed-effects model by using the Stata command xtreg with dependent variable ANS and 13 variables, including 11 independent ones and 2 control variables in our panel data.
Type: xtreg ANS FDIENT FDICAP FDIAST FDIEMP FDICOM FDIWAG FDITUR FDIGDP FDIROTC FDIROFA FDIROS Size GDPgrowth, fe.
Or you can click this command on the Stata’s Menu by avoiding typing errors. Note that the option fe should be chosen for the fixed-effects model.
To compare the results with random-effects model that will be performed later; we must now store the results with fixed-effects regression by using the command “estimates store fixed”.
Then, we run the random-effects model by using the Stata command xtreg with the same variables by choosing the option re
Type: xtreg ANS FDIENT FDICAP FDIAST FDIEMP FDICOM FDIWAG FDITUR FDIGDP FDIROTC FDIROFA FDIROS Size GDPgrowth, re.
Or you can click this command on the Stata’s Menu by avoiding typing errors.
Also, we save the estimates by using the command “estimates store random”.
For comparing fixed and random-effects models, we perform now the Hausman test by typing” hausman fixed random
The negative sign can arise if different estimates of the error variance are used in forming variance of b and variance of capital B. In that case, we need to use the sigmamore option, which specifies that both covariance matrices are based on the (same) estimated disturbance variance from the efficient estimator.
Type: hausman fixed random, sigmamore
By focusing on the DWH test, we determine whether there are significant differences in the coefficients. This significant Hausman test allow us to accept the null hypothesis by indicating that the Fixed-effects model is appropriate.
By caution, it is necessary to test the presence of random effects by using Breusch-Pagan Lagrange multiplier. We can see that the result of this test is significant by indicating random effects and refusing the Pooled OLS model.
As the Hausman test has eliminated the random-effects model; and Lagrange multiplier has refused the Pooled OLS model. We select with confidence now Fixed-effects one.
We must check the Heteroskedasticity test for the selected Fixed-effects model by using the command xttest3. This is a user-written program, to install it type: ssc install xtest3
Because, Stata stored recently the results of random-effects model; we rerun the fixed-effects regression; then run Heteroskedasticity test. The null is homoskedasticity.
And our significant test rejects the null and indicates that our Fixed-effects model has a heteroskedasticity problem.
Hence, we use the option robust to correct for this regression model by typing the xtreg with 2 options such as robust and fe.
Finally, the robust Fixed-effects model is used for assessing the proposed research hypotheses in our research.
Please see the detail results and analysis in our article Nguyen Hoang Viet, Phan Thanh Tu and Lobo Antonio (2019)