Two-Step System GMM (Generalized Method of Moments)

 The Two-Step System GMM is a dynamic panel data estimation technique designed to address:

  • Endogeneity of explanatory variables

  • Unobserved heterogeneity (fixed effects)

  • Autocorrelation and heteroskedasticity

It improves on simpler estimators such as pooled OLS or fixed effects (FE), which are biased in short panels when the model includes a lagged dependent variable.


⚙️ System GMM: The Basics

Developed by Arellano & Bover (1995) and Blundell & Bond (1998), System GMM combines:

  1. Difference GMM (Arellano & Bond, 1991):
    First-differences the model to remove fixed effects and uses lagged levels as instruments.

  2. Level equation (Arellano & Bover, 1995; Blundell & Bond, 1998):
    Keeps the model in levels and uses lagged differences as instruments, which improves efficiency, especially when the series are persistent.
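
The two sets of instruments correspond to two sets of moment conditions. As a sketch for the simplest case, where the only regressor is the lagged dependent variable and the idiosyncratic errors are serially uncorrelated (both assumptions made here for illustration), they can be written as:

$$
E\left[y_{i,t-s}\,\Delta\varepsilon_{it}\right] = 0 \quad \text{for } s \ge 2,\; t = 3,\dots,T \qquad \text{(difference equation)}
$$

$$
E\left[\Delta y_{i,t-1}\,(\mu_i + \varepsilon_{it})\right] = 0 \quad \text{for } t = 3,\dots,T \qquad \text{(level equation)}
$$

Here $y_{it}$ is the outcome, $\mu_i$ the unobserved fixed effect, and $\varepsilon_{it}$ the idiosyncratic error. The second set holds only if changes in $y$ are uncorrelated with the fixed effect, which is the extra assumption System GMM relies on.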


🧮 The Dynamic Panel Model

A typical dynamic panel data model is:

$$y_{it} = \alpha y_{i,t-1} + \beta X_{it} + \mu_i + \varepsilon_{it}$$

Where:

  • $y_{it}$ is the dependent variable

  • $y_{i,t-1}$ is the lagged dependent variable

  • $X_{it}$ are explanatory variables

  • $\mu_i$ is the unobserved fixed effect

  • $\varepsilon_{it}$ is the idiosyncratic error
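
To see why this model needs a special estimator, the following minimal sketch simulates the model without explanatory variables and applies pooled OLS and the within (FE) estimator. The function names, sample sizes, and the true value 0.5 for the autoregressive coefficient are illustrative assumptions, not part of any library.

```python
import numpy as np

def simulate_panel(n_units=500, n_periods=6, alpha=0.5, burn_in=50, seed=0):
    """Simulate y_it = alpha * y_{i,t-1} + mu_i + eps_it (no X, an assumption).

    Returns an array of shape (n_units, n_periods).
    """
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=n_units)          # unobserved fixed effects
    y = mu / (1.0 - alpha)                 # start at each unit's long-run mean
    path = []
    for _ in range(burn_in + n_periods):
        y = alpha * y + mu + rng.normal(size=n_units)
        path.append(y)
    return np.column_stack(path[burn_in:])  # drop the burn-in periods

def pooled_ols(y):
    """Slope from regressing y_it on a constant and y_{i,t-1}, ignoring mu_i."""
    y_lag, y_cur = y[:, :-1].ravel(), y[:, 1:].ravel()
    X = np.column_stack([np.ones_like(y_lag), y_lag])
    return np.linalg.lstsq(X, y_cur, rcond=None)[0][1]

def within_fe(y):
    """Slope after subtracting each unit's time mean (within transformation)."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    y_lag_d = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_d = y_cur - y_cur.mean(axis=1, keepdims=True)
    return (y_lag_d * y_cur_d).sum() / (y_lag_d ** 2).sum()

y = simulate_panel()
print("true alpha: 0.5")
print("pooled OLS:", pooled_ols(y))   # expected to overshoot the true value
print("within FE: ", within_fe(y))    # expected to undershoot the true value
```

In a short panel like this one, pooled OLS tends to overstate the coefficient because the lagged dependent variable is correlated with the fixed effect, while FE tends to understate it because demeaning correlates the regressor with the transformed error.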


🔁 Why “Two-Step”?

First Step:

  • Uses an initial weighting matrix assuming homoskedastic errors.

  • Generates consistent but inefficient estimates of parameters and residuals.

Second Step:

  • Uses residuals from Step 1 to build a robust weighting matrix accounting for heteroskedasticity and autocorrelation.

  • Produces asymptotically efficient estimates.

🔺 However, the conventional standard errors in Two-Step GMM tend to be downward biased in finite samples, so Windmeijer (2005) corrected standard errors are commonly used.
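
The two steps can be sketched in code. The example below is a minimal illustration under strong assumptions: simulated data, no explanatory variables other than the lagged dependent variable, a balanced panel, and all available lags used as instruments. All function names are hypothetical, and the Windmeijer correction is not implemented.

```python
import numpy as np

def simulate_panel(n_units=1000, n_periods=6, alpha=0.5, burn_in=50, seed=0):
    """Simulate y_it = alpha * y_{i,t-1} + mu_i + eps_it; shape (N, T)."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=n_units)
    y = mu / (1.0 - alpha)
    path = []
    for _ in range(burn_in + n_periods):
        y = alpha * y + mu + rng.normal(size=n_units)
        path.append(y)
    return np.column_stack(path[burn_in:])

def build_unit(y_i):
    """Stack difference and level equations for one unit (periods t = 2..T-1)."""
    T = len(y_i)
    dy = np.diff(y_i)                       # dy[s] = y[s+1] - y[s]
    n_eq = T - 2
    q_diff = n_eq * (n_eq + 1) // 2         # instruments for difference equations
    Z = np.zeros((2 * n_eq, q_diff + n_eq))
    col = 0
    for j, t in enumerate(range(2, T)):
        Z[j, col:col + t - 1] = y_i[:t - 1]     # lagged levels y_0..y_{t-2}
        col += t - 1
        Z[n_eq + j, q_diff + j] = dy[t - 2]     # lagged difference for level eq.
    dep = np.concatenate([dy[1:], y_i[2:]])     # differenced y_t, then level y_t
    reg = np.concatenate([dy[:-1], y_i[1:-1]])  # differenced and level y_{t-1}
    return dep, reg, Z

def gmm_step(deps, regs, Zs, W):
    """Linear GMM estimate of the single coefficient for weighting matrix W."""
    Zx = sum(Z.T @ x for Z, x in zip(Zs, regs))
    Zy = sum(Z.T @ d for Z, d in zip(Zs, deps))
    return (Zx @ W @ Zy) / (Zx @ W @ Zx)

def two_step_system_gmm(y):
    deps, regs, Zs = zip(*(build_unit(y_i) for y_i in y))
    n_eq = y.shape[1] - 2
    # Step 1: weighting matrix built as if errors were homoskedastic.
    H = np.eye(2 * n_eq)
    H[:n_eq, :n_eq] = 2 * np.eye(n_eq) - np.eye(n_eq, k=1) - np.eye(n_eq, k=-1)
    W1 = np.linalg.pinv(sum(Z.T @ H @ Z for Z in Zs))
    alpha1 = gmm_step(deps, regs, Zs, W1)
    # Step 2: weighting matrix rebuilt from the step-1 residuals.
    g = np.array([Z.T @ (d - alpha1 * x) for Z, d, x in zip(Zs, deps, regs)])
    W2 = np.linalg.pinv(g.T @ g)
    alpha2 = gmm_step(deps, regs, Zs, W2)
    return alpha1, alpha2

alpha1, alpha2 = two_step_system_gmm(simulate_panel())
print("one-step estimate:", alpha1)
print("two-step estimate:", alpha2)
```

Both estimates should land near the simulated value of 0.5. For applied work, an established implementation with corrected standard errors is preferable to a hand-rolled sketch like this one.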


🧰 Key Features and Assumptions

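| Feature | Description |
|---|---|
| Instruments | Lagged levels (for the difference equation) and lagged differences (for the level equation) of variables |
| Endogeneity | Can instrument endogenous variables |
| Fixed Effects | Removed via differencing in the difference equation; the level-equation instruments are assumed uncorrelated with the fixed effects |
| Autocorrelation | Assumes no second-order serial correlation in the differenced errors |
| Heteroskedasticity | Robust in the second step |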

📊 When to Use System GMM

  • Panel data with small time dimension (T) and large cross-section (N)

  • Presence of endogenous regressors

  • Need to include lagged dependent variables

  • Data with individual effects and potential measurement error


🧾 Output Diagnostics

| Test | Purpose |
|---|---|
| Hansen Test / Sargan Test | Validity of instruments (overidentifying restrictions) |
| Arellano-Bond AR(1), AR(2) Test | Checks for serial correlation in differenced residuals |
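
As a sketch of how the Hansen test is built (notation introduced here for illustration): let $Z_i$ be the instrument matrix for unit $i$, $\hat{e}_i^{(1)}$ and $\hat{e}_i^{(2)}$ the step-one and step-two residual vectors, $q$ the number of instruments, and $k$ the number of estimated parameters. The statistic is then

$$
J = \Big(\sum_i Z_i' \hat{e}_i^{(2)}\Big)' \Big(\sum_i Z_i' \hat{e}_i^{(1)} \hat{e}_i^{(1)\prime} Z_i\Big)^{-1} \Big(\sum_i Z_i' \hat{e}_i^{(2)}\Big) \;\sim\; \chi^2_{q-k}
$$

under the null hypothesis that the instruments are valid. A small p-value casts doubt on the instruments. For the serial correlation tests, first-order correlation in the differenced residuals is expected by construction, so it is the AR(2) test that should fail to reject.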

🔍 Comparison with Other Estimators

| Estimator | Handles Endogeneity? | Lagged DV? | Efficiency |
|---|---|---|---|
| Pooled OLS | ❌ | ❌ | Low |
| Fixed Effects | ❌ | ❌ | Medium |
| Difference GMM | ✅ | ✅ | Medium |
| System GMM | ✅ | ✅ | ✅✅ (especially in Two-Step) |

📌 Final Notes

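  • Too many instruments can overfit the endogenous variables and weaken the Hansen test, which makes GMM results unreliable. Rule of thumb: keep the number of instruments below the number of groups.

  • Best suited for panels with many units and few time periods, such as bank-level performance, firm-level profitability, or investment behavior over time.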
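
The rule of thumb on instrument count is easy to check before estimating anything. The helper below is a hypothetical sketch that counts instruments for the simplest case only: a balanced panel, the lagged dependent variable as the only instrumented regressor, all available lags used, and no constant or time dummies.

```python
def count_instruments(n_periods, system=True):
    """Instrument count for an AR(1) panel model using all available lags.

    Assumes no other regressors, no constant, and no time dummies.
    """
    n_eq = n_periods - 2                    # usable equations per unit
    if n_eq < 1:
        raise ValueError("need at least 3 time periods")
    diff = n_eq * (n_eq + 1) // 2           # lagged levels, difference equation
    level = n_eq if system else 0           # lagged differences, level equation
    return diff + level

def too_many_instruments(n_periods, n_groups, system=True):
    """True when the rule of thumb (instruments < groups) is violated."""
    return count_instruments(n_periods, system) >= n_groups

for T in (5, 10, 15):
    print(T, count_instruments(T, system=False), count_instruments(T))
```

The count grows roughly with the square of T, which is why restricting the lag depth or collapsing the instrument set is a common response when the panel has more than a handful of periods.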
