Applied Econometrics with R

Christian Kleiber, Universität Basel, Switzerland
Achim Zeileis, Wirtschaftsuniversität Wien, Austria

Outline
• R and econometrics
• Robust standard errors
  Example: Sandwich variance estimators for a tobit model
• Gaps
• AER: book and package
Christian Kleiber 1 U Basel

R and econometrics
• Language and terminology in econometrics is somewhat distinct from the terminology used in mainstream statistics.
  Two examples:

    Statistics                  Econometrics
    factor                      dummy variables
    generalized linear model    probit, logit, ...

• Generally, not much awareness of statistical GLM literature among econometricians.
• Visualization not very common.

Spinograms and GLMs
[Figure: two spinograms of participation (no/yes), one against education and one against age.]
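
The terminology mapping above can be made concrete in R. The following is a minimal sketch on simulated data; the variable names (education, age, participation) only echo the axis labels of the spinograms and are assumptions, not the data used in the talk.

```r
# Simulated data; names and coefficients are purely illustrative.
set.seed(1)
n <- 200
education <- factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                    levels = c("low", "medium", "high"))
age <- runif(n, 2, 6)

# "factor" (statistics) = "dummy variables" (econometrics):
# model.matrix() expands the factor into 0/1 columns.
X <- model.matrix(~ education + age)
colnames(X)

# "generalized linear model" (statistics) = "probit" (econometrics):
eta <- -1 + 0.8 * (education == "high") + 0.3 * age
participation <- factor(ifelse(rnorm(n) < eta, "yes", "no"),
                        levels = c("no", "yes"))
fm_probit <- glm(participation ~ education + age,
                 family = binomial(link = "probit"))
coef(fm_probit)

# Spinogram of a binary response against a numeric regressor (base graphics).
spineplot(participation ~ age)
```

With the default treatment contrasts, the first factor level ("low") serves as the reference category, which is what econometricians would call the omitted dummy.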

R and econometrics
Traditional econometric software
Applied econometrics: EViews, TSP, PcGIVE, SAS, Stata, ...
Theory and methodology: GAUSS, Ox, Matlab, S-PLUS, ...

R and econometrics
Why R?
• superior graphics
• object orientation
• reproducibility

R and econometrics
• Linear models and extensions: OLS, nonlinear regression, systems of equations
• Diagnostics and validation: Robust regression, sandwich covariance matrices, diagnostic tests
• Microeconometrics: Logit, Probit, Poisson regression (via glm()), Tobit, modified count data models (ZIP, hurdle), duration models (package survival)
• Time series: (S)ARIMA(X), unit roots and cointegration (packages tseries, urca), structural change, ARCH models (see Rmetrics), structural time series models
See also CRAN task view: econometrics at [Link]

Robust standard errors
In the linear regression model

$$y_i = x_i^\top \beta + \varepsilon_i,$$

we have for OLS, under technical assumptions,

$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, \sigma^2 Q_{XX}^{-1})$$

provided model is correctly specified.
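
As a small illustration of the classical result for OLS, the covariance matrix reported by lm() can be reproduced by hand. This is a sketch on simulated data, assuming that Q_XX denotes the limit of X'X/n, so that the finite-sample estimate is sigma^2 (X'X)^{-1}; all object names are illustrative.

```r
# Simulated homoskedastic regression; names are illustrative.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fm <- lm(y ~ x)
X <- model.matrix(fm)
e <- residuals(fm)

# Error variance estimate with the degrees-of-freedom correction used by lm()
sigma2 <- sum(e^2) / (n - ncol(X))

# Classical covariance estimate: sigma^2 (X'X)^{-1}
V_classical <- sigma2 * solve(crossprod(X))

all.equal(V_classical, vcov(fm))
sqrt(diag(V_classical))
```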

If only the conditional mean is correctly specified, we have

$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, Q_{XX}^{-1} \Sigma_v Q_{XX}^{-1}),$$

a sandwich variance formula. Robustness considerations suggest estimating the latter.
In econometrics usually called “White standard errors” or “heteroskedasticity-consistent (HC) standard errors”.
Depending on the context, this is also known as Eicker-White, Huber-White, Eicker-Huber-White ...
The matrix $\Sigma_v = \mathrm{Cov}(v_i) = \mathrm{Cov}(x_i \varepsilon_i) = E(\varepsilon_i^2 x_i x_i^\top)$ comes from an estimating equation. Idea generalizes to GLMs and many other models.
R provides infrastructure for HC (and also HAC) covariances in the sandwich package.

Robust standard errors
Example: “Fair’s affairs” (Fair, J. Political Economy 1978)
• Cross-section data on frequency of extramarital affairs from a survey conducted by Psychology Today in 1969.
• n = 601, dependent variable is number of extramarital affairs, covariates are gender, age, years married, children, religiousness, education, occupation, rating of marriage.
• 75.04% of the respondents do not report any extramarital affairs.
Data will be available in package AER.
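
Returning to the sandwich formula for the linear model: the simplest HC estimator (often labelled HC0) can be computed by hand in base R. The sketch below uses simulated heteroskedastic data, not the affairs data; the names bread and meat are used loosely here (without the 1/n scaling) and all objects are illustrative.

```r
# Simulated regression with heteroskedastic errors; names are illustrative.
set.seed(123)
n <- 500
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = exp(0.5 * x))

fm <- lm(y ~ x)
X <- model.matrix(fm)
e <- residuals(fm)

bread <- solve(crossprod(X))   # (X'X)^{-1}
meat  <- crossprod(X * e)      # sum_i e_i^2 x_i x_i'  (row i of X scaled by e_i)
V_hc0 <- bread %*% meat %*% bread

se_classical <- sqrt(diag(vcov(fm)))
se_hc0 <- sqrt(diag(V_hc0))
cbind(classical = se_classical, HC0 = se_hc0)

# With the sandwich package installed, vcovHC(fm, type = "HC0")
# should give the same matrix.
```

Coding the formula by hand is only for exposition; in practice the sandwich package is the more convenient route, since it also covers the other HC variants and HAC estimators.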

Robust standard errors
Classical Tobit model (Tobin 1958) is a model for left-censored (at zero) data. Standard approach employs Gaussian MLE.
In R, this can be fitted (easily) using survreg() from the survival package, see example("tobin") there.
New function tobit() in package AER provides a convenience interface to survreg() (and a bit more).
Task: standard errors under weaker assumptions.

Robust standard errors
R> fm_tobit <- tobit(affairs ~ age + yearsmarried + religiousness +
+    occupation + rating, data = FairAffair)
R> coeftest(fm_tobit)

z test of coefficients:

              Estimate Std. Error z value Pr(>|z|)
(Intercept)     8.1742     2.7414    2.98   0.0029
age            -0.1793     0.0791   -2.27   0.0234
yearsmarried    0.5541     0.1345    4.12  3.8e-05
religiousness  -1.6862     0.4038   -4.18  3.0e-05
occupation      0.3261     0.2544    1.28   0.2000
rating         -2.2850     0.4078   -5.60  2.1e-08
Log(scale)      2.1099     0.0671   31.44  < 2e-16
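
The survreg() route mentioned for the Tobit model can be sketched as follows. This uses simulated data rather than the affairs data, and the variable names (x, ystar, y) are illustrative; the response is coded as a left-censored Surv object with a Gaussian distribution.

```r
library(survival)   # recommended package shipped with R

# Simulated latent-variable model, censored from below at zero.
set.seed(2006)
n <- 400
x <- rnorm(n)
ystar <- 1 + 2 * x + rnorm(n, sd = 2)   # latent response
y <- pmax(ystar, 0)                     # observed response

# In Surv(), the second argument flags the uncensored observations (y > 0).
fm_surv <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")

coef(fm_surv)      # regression coefficients
fm_surv$scale      # estimated scale (sigma)
mean(y == 0)       # share of censored observations
```

The fitted object reports the scale on the log scale in its summary, which is why a Log(scale) row appears in coefficient tables for such models.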

R> coeftest(fm_tobit, vcov = sandwich)

z test of coefficients:

              Estimate Std. Error z value Pr(>|z|)
(Intercept)     8.1742     3.0779    2.66   0.0079
age            -0.1793     0.0889   -2.02   0.0437
yearsmarried    0.5541     0.1372    4.04  5.3e-05
religiousness  -1.6862     0.3999   -4.22  2.5e-05
occupation      0.3261     0.2460    1.33   0.1850
rating         -2.2850     0.3935   -5.81  6.4e-09
Log(scale)      2.1099     0.0548   38.48  < 2e-16

Robust standard errors
Can also do
R> linearHypothesis(fm_tobit, "age = 0", vcov = sandwich)

Linear hypothesis test

Hypothesis: age = 0

[...]

  Res.Df Df Chisq Pr(>Chisq)
1    594
2    595 -1  4.07      0.044
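
For a single restriction such as age = 0, the Wald chi-squared statistic is just the squared z statistic based on the same covariance estimate. The arithmetic can be checked from the numbers in the sandwich-based coefficient table; the object names below are illustrative.

```r
# Estimate and sandwich standard error for age, copied from the table above.
est <- -0.1793
se_sandwich <- 0.0889

z <- est / se_sandwich                                 # z statistic
chisq <- z^2                                           # Wald statistic, 1 df
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)   # same as 2 * pnorm(-abs(z))

round(c(z = z, chisq = chisq, p = p_value), 4)
```

Up to rounding of the printed inputs, this reproduces the Chisq value of about 4.07 and the p-value of about 0.044 reported by the linear hypothesis test.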

Gaps
• dynamic regressions (but see packages dyn and dynlm)
• multiple time series models (structural VARs, ...)
• nonlinear time series models (TAR, smooth transition models, ...)
• panel data methods, in particular
  – least-squares methods
  – dynamic models
  – microeconometric models (GLMs with panel data, ...)
• GMM and instrumental variables
• non- and semiparametric regression

AER: Book and package
Christian Kleiber and Achim Zeileis: Applied Econometrics with R, Springer-Verlag, New York, 2006 (?)
Contents:
• R Basics
• Linear Regression and Extensions
• Validating Linear Models
• Models of Microeconometrics
• Time Series Models
• Programming Your Own Analysis

AER: Book and package
Package AER contains more than 60 data sets (with examples) from
• textbooks
  – B. Baltagi: Econometrics, 3e
  – W.H. Greene: Econometric Analysis, 5e
  – P.H. Franses: Time Series Models for Business and Economic Forecasting
• data archives of the Journal of Applied Econometrics, Journal of Business and Economic Statistics
• selected further sources (Empirical Economics, PARADE magazine ...)

Be sure to attend
Econometrics and Social Science
(Spotlights: HS 0.3, Forum: Aula 3)
Friday 16 15:00-18:30