Title: Study Guide Notes
Course: Intermediate Econometrics
Institution: City University London


Know how to test for a linear restriction on regression coefficients

T-test statistic. Often a hypothesis of economic interest implies a linear restriction on more than one coefficient, such as constant returns to scale in a Cobb-Douglas production function. In general, we can formulate such a hypothesis as H₀: r'β = c for some scalar value c and k-dimensional vector r. Under assumptions A.1-A.4 and A.6 (normality) we can use a t-test to test this hypothesis. The test statistic for H₀ becomes t = (r'b − c) / se(r'b), where se(r'b) = s √(r'(X'X)⁻¹r); under H₀ it has a t distribution with n − k degrees of freedom.

Example: Consider the Cobb-Douglas production model Yᵢ = A Kᵢ^β₂ Lᵢ^β₃ exp(εᵢ). Taking logs yields our linear regression model log Yᵢ = β₁ + β₂ log Kᵢ + β₃ log Lᵢ + εᵢ, with β₁ = log A.

Hypothesis of CRS (Constant Returns to Scale): H₀: β₂ + β₃ = 1, i.e. r = (0, 1, 1)' and c = 1.

Test statistic: t = (b₂ + b₃ − 1) / se(b₂ + b₃). Acceptance/Rejection rule: with a significance level α, we reject H₀ if |t| exceeds the critical value of the t distribution with n − k degrees of freedom (two-sided, at level α).

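A minimal sketch of this test in Python, assuming statsmodels and simulated Cobb-Douglas data (the variable names logY, logK, logL and all numbers are illustrative, not from the notes):

```python
# Sketch: t-test of the single linear restriction beta_K + beta_L = 1 (CRS)
# in a logged Cobb-Douglas regression, on simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
logK = rng.normal(3.0, 0.5, n)                                  # log capital
logL = rng.normal(4.0, 0.5, n)                                  # log labour
logY = 0.5 + 0.4 * logK + 0.6 * logL + rng.normal(0, 0.1, n)    # true CRS: 0.4 + 0.6 = 1

df = pd.DataFrame({"logY": logY, "logK": logK, "logL": logL})
ols = smf.ols("logY ~ logK + logL", data=df).fit()

# H0: r'beta = c with r = (0, 1, 1)' and c = 1
print(ols.t_test("logK + logL = 1"))    # reports the t statistic and its p-value
```

With significance level α, CRS is rejected when the reported p-value is below α.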

Know how to test for the joint significance of two or more coefficients

This test is a joint test of the hypothesis that all the coefficients except the constant term are zero.

The central result needed to carry out this test is the F statistic: F = (R²/(k − 1)) / ((1 − R²)/(n − k)), which under H₀ follows an F distribution with k − 1 and n − k degrees of freedom.

Large values of F give evidence against the validity of the hypothesis. Note that a large F is induced by a large value of R²
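As a sketch (statsmodels again, on simulated data with illustrative names): the overall F statistic is reported with every OLS fit, and a joint test that a subset of coefficients is zero can be requested explicitly.

```python
# Sketch: joint significance tests with statsmodels, on simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1.0 + 0.5 * df["x1"] + rng.normal(size=n)   # x2 and x3 are irrelevant

res = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

# F-test that ALL slope coefficients are zero (the statistic discussed above)
print(res.fvalue, res.f_pvalue)

# Joint F-test that a chosen subset of coefficients (here x2 and x3) are both zero
print(res.f_test(["x2 = 0", "x3 = 0"]))
```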

Know how to interpret the coefficients of a regression as ceteris paribus conditions, also distinguishing among a linear model, a log-linear model and a semi-log model

Consider the linear regression model yᵢ = β₁ + β₂x₂ᵢ + … + βₖxₖᵢ + εᵢ under the classical assumptions A.1-A.4. What interpretation do the individual coefficients βₖ have?

In a multiple regression model, single coefficients can only be interpreted under ceteris paribus conditions. For example, the coefficient on age could measure the effect of age on the expected wage of a woman, if the education level and years of experience are kept constant. In a linear model, βₖ measures the change in the expected value of y from a one-unit increase in xₖ, holding the other regressors fixed. In a log-linear model (log y regressed on the logs of the regressors), βₖ is an elasticity: the percentage change in y associated with a one per cent increase in xₖ.

In a semi-log model (log y regressed on the regressors in levels), 100·βₖ is approximately the percentage change in y associated with a one-unit increase in xₖ.
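In symbols, a standard summary of the three cases (generic notation, not copied from the notes):

```latex
% Ceteris paribus interpretation of a single coefficient beta_k
\begin{align*}
\text{linear } (y = x'\beta + \varepsilon):\quad
  & \frac{\partial E[y \mid x]}{\partial x_k} = \beta_k
  && \text{units of $y$ per unit change in $x_k$}\\
\text{log-linear } (\log y \text{ regressed on } \log x):\quad
  & \frac{\partial \log y}{\partial \log x_k} = \beta_k
  && \text{elasticity: \% change in $y$ per 1\% change in $x_k$}\\
\text{semi-log } (\log y \text{ regressed on } x):\quad
  & \frac{\partial \log y}{\partial x_k} = \beta_k
  && 100\,\beta_k \approx \text{\% change in $y$ per unit change in $x_k$}
\end{align*}
```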

Explain the problems of omitting a relevant variable and including an irrelevant variable

To illustrate this, consider the following two models:
(1) y = X₁β₁ + X₂β₂ + ε
(2) y = X₁β₁ + v

First we consider the case of omitted variables, that is, we estimate (2) whereas (1) is the true Data Generating Process (DGP). The OLS estimator we consider is b₁ = (X₁'X₁)⁻¹X₁'y.

The properties of this estimator can be obtained by substituting the true model (1) for y.
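Writing the substitution out in partitioned-matrix notation (a standard derivation, reproduced here for completeness):

```latex
% Substitute the true model y = X_1 beta_1 + X_2 beta_2 + eps
% into b_1 = (X_1'X_1)^{-1} X_1' y
\begin{align*}
b_1 &= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 + (X_1'X_1)^{-1}X_1'\varepsilon, \\
E[b_1 \mid X] &= \beta_1 + \underbrace{(X_1'X_1)^{-1}X_1'X_2\,\beta_2}_{\text{omitted variable bias}}.
\end{align*}
```

The bias term vanishes only if β₂ = 0 (the omitted regressors are irrelevant) or X₁'X₂ = 0 (the omitted regressors are uncorrelated with the included ones).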

The OLS estimator obtained when omitting relevant variables will therefore typically suffer from omitted variable bias. In addition, s² will be a biased estimate of the variance of the true regression error ⟹ the t-statistics and F-statistics are invalid.

The converse is less of a problem. In that case, we estimate (1) whereas (2) is the true DGP. The OLS estimator will not have any bias when irrelevant variables are included, and test statistics will be valid in this case. It is nevertheless preferable to estimate β using the restricted model, because the inclusion of irrelevant variables will typically increase the variance of our estimates.

Explain the difference between exact multicollinearity and correlation between regressors

In general, there is nothing wrong with including variables in your model that are correlated, for example experience and schooling, age and experience, or the inflation rate and the nominal interest rate. However, when correlations are high, it becomes hard to identify the individual impact of each of the variables. Multicollinearity is a situation in which an exact or approximate linear relationship exists between the explanatory variables. Exact multicollinearity arises if the rank of the regressor matrix X is less than k, the number of regressors, or equivalently if the columns of X are linearly dependent. The result is that X'X is not invertible, so that the ordinary least squares estimator is not unique; in other words, the parameters of such a regression model are unidentified. The use of too many dummy variables (which are either 0 or 1) is a typical cause of exact multicollinearity: the dummy variable trap (see the sketch below). Exact multicollinearity is easily solved by excluding one of the variables from the model. More common is for the regressors to be merely correlated: the higher the correlation between the regressors, the less precise our estimates will be.
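A small numerical sketch of the dummy variable trap (illustrative Python, not from the notes): with a constant plus a full set of category dummies, the columns of X are linearly dependent and X'X cannot be inverted.

```python
# Sketch: exact multicollinearity from the dummy variable trap.
import numpy as np

n = 6
male = np.array([1, 0, 1, 1, 0, 0])
female = 1 - male                          # male + female = 1 = the constant column
other = np.random.default_rng(2).normal(size=n)
X = np.column_stack([np.ones(n), male, female, other])

print(np.linalg.matrix_rank(X))            # 3, not 4: columns are linearly dependent
print(np.linalg.cond(X.T @ X))             # huge: X'X is (numerically) singular
# Dropping either dummy (or the constant) removes the exact collinearity.
```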

Explain the problem of outliers

In calculating the OLS estimator, some observations may have a disproportionate impact. An outlier is an observation that deviates markedly from the rest of the sample; it could be due to mistakes or problems in the data. An outlier becomes an influential observation if it has a substantial impact on the estimated regression line.

Be able to explain the maximum likelihood principle and the properties of ML

Maximum Likelihood: the intuition. From the assumed distribution of the data (e.g. y, or y given x), we determine the likelihood of observing the sample we happen to have, as a function of the unknown parameters. Next, we choose as our estimates those values of the unknown parameters that give us the highest likelihood. In general, this leads to consistent, asymptotically normal and efficient estimators for our parameters, provided the (log-)likelihood function is correctly specified (i.e. the distributional assumptions are correct). The Likelihood Principle. The “probability” of observing yt|xt is given by the conditional density f(yt|xt; θ). If the observations are IID, the probability of y₁,…, yT (given X) is the product f(y₁|x₁; θ) · f(y₂|x₂; θ) ⋯ f(yT|xT; θ).

The likelihood contribution for observation t is defined as Lt(θ) = f(yt|xt; θ),

and the likelihood function for the sample is defined as the product L(θ|Y; X) = ∏t Lt(θ) over t = 1,…, T.

The ML estimator is chosen to maximise L(θ|Y; X), i.e. to maximise the likelihood of observing the data given the model.
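A minimal sketch of the ML principle for the normal linear regression model (numpy/scipy; the simulated data and the use of a numerical optimiser are my illustration, not part of the notes):

```python
# Sketch: ML estimation of y_t = x_t'beta + eps_t with eps_t ~ iid N(0, sigma^2).
# The log-likelihood is maximised numerically; for this model the ML estimate
# of beta coincides with OLS, which is a useful check.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

def neg_loglik(params):
    beta, log_sigma = params[:-1], params[-1]       # sigma > 0 enforced via its log
    return -np.sum(norm.logpdf(y, loc=X @ beta, scale=np.exp(log_sigma)))

res = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
print(res.x[:2], np.exp(res.x[2]))                  # close to (1, 2) and 0.5
# res.hess_inv approximates the inverse of the (observed) information matrix,
# i.e. the asymptotic covariance matrix of the estimates.
```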

Properties of ML. Assume that the true model is contained in the statistical model. Then, under some regularity conditions: the ML estimator is consistent, plim θ̂ = θ; and the ML estimator is asymptotically normal.

Here V = I(θ)⁻¹ is the asymptotic variance, and the negative expected Hessian I(θ) = −E[∂² log Lt(θ)/∂θ∂θ'] is called the information matrix.

The more curvature the likelihood function has, the more precision. The ML estimator is asymptotically efficient: all other consistent and asymptotically normal estimators will have an asymptotic variance at least as large as I(θ)⁻¹, which is denoted the Cramér-Rao lower bound.

Define the concepts of likelihood function, log likelihood function, score vector and information matrix. Why do we need the information matrix?

It is easier to maximise the log-likelihood function, log L(θ|Y; X) = Σt log Lt(θ), since the product of densities becomes a sum. The information matrix is needed because its inverse gives the asymptotic variance of the ML estimator (and hence the Cramér-Rao lower bound).

Define the score vector as s(θ) = ∂ log L(θ)/∂θ = Σt st(θ),

where st(θ) = ∂ log Lt(θ)/∂θ is the score for each observation. The first-order conditions, the K so-called likelihood equations, state that the score vanishes at the ML estimate: s(θ̂) = 0.

With the help of a graph, be able to intuitively explain the Wald test, the Lagrange multiplier test (score test) and the likelihood ratio test

Consider a null hypothesis of interest, H₀, and an alternative, HA. E.g. H₀: Rθ = q against HA: Rθ ≠ q, where R is a J × K matrix of restrictions and q a J-vector.

Let θ̂₀ and θ̂ denote the ML estimates under H₀ and under HA respectively. Wald test: estimate the model only under HA, and look at the distance Rθ̂ − q, normalised by the covariance matrix. Likelihood ratio (LR) test: estimate under H₀ and under HA, and look at the loss in likelihood. Lagrange multiplier (LM) or score test: estimate under H₀ only, and see if the first-order conditions (FOCs) are significantly violated. The three tests are asymptotically equivalent.

Wald Test. Recall that the ML estimator θ̂ is asymptotically normal with covariance matrix estimated by V̂(θ̂), so that Rθ̂ is asymptotically normal with mean Rθ and covariance matrix R V̂(θ̂) R'. If the null hypothesis is true, Rθ = q, and a natural test statistic is the quadratic form ξW = (Rθ̂ − q)' [R V̂(θ̂) R']⁻¹ (Rθ̂ − q).

Under the null this is asymptotically distributed as χ²(J), where J is the number of restrictions. An example is the (squared) t-ratio for H₀: θₖ = 0.

Requires only estimation under the alternative HA
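As a sketch (statsmodels on simulated data; names and restriction values are illustrative), the Wald statistic for a set of linear restrictions can be requested directly from the unrestricted fit:

```python
# Sketch: Wald test of R theta = q, using only the unrestricted estimates.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 0.4 * df["x1"] + 0.6 * df["x2"] + rng.normal(scale=0.5, size=n)

unrestricted = smf.ols("y ~ x1 + x2", data=df).fit()

# Two linear restrictions (J = 2): beta_x1 = 0.5 and beta_x2 = 0.5
print(unrestricted.wald_test(["x1 = 0.5", "x2 = 0.5"], use_f=False))
# use_f=False asks for the chi-square form, with J degrees of freedom
```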

Likelihood Ratio (LR) Test

For the LR test we estimate both under H₀ and under HA. The LR test statistic is given by ξLR = 2(log L(θ̂) − log L(θ̂₀)),

where log L(θ̂) and log L(θ̂₀) are the two maximised log-likelihood values, under HA and under H₀ respectively.

Under the null, this is asymptotically distributed as χ²(J).

The test is insensitive to how the model and restrictions are formulated.
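A sketch of the LR test by direct comparison of the two maximised log-likelihoods (statsmodels on simulated data; the particular restriction, that x2 has no effect, is just an illustration):

```python
# Sketch: LR test, estimating the model both under H0 (x2 excluded) and under HA.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(5)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 0.4 * df["x1"] + 0.3 * df["x2"] + rng.normal(scale=0.5, size=n)

unrestricted = smf.ols("y ~ x1 + x2", data=df).fit()
restricted = smf.ols("y ~ x1", data=df).fit()          # H0: beta_x2 = 0

lr = 2 * (unrestricted.llf - restricted.llf)           # twice the loss in log-likelihood
J = 1                                                  # number of restrictions
print(lr, chi2.sf(lr, df=J))                           # LR statistic and its p-value
```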

Lagrange Multiplier (LM) or score test. Let s(·) be the score function of the unrestricted model. Recall that the score is zero at the unrestricted estimate: s(θ̂) = 0.

If the restriction is true, the score evaluated at the restricted estimate, s(θ̂₀), should also be close to zero.

This can be tested by the quadratic form ξLM = s(θ̂₀)' I(θ̂₀)⁻¹ s(θ̂₀),

which under the null is asymptotically distributed as χ²(J).

Be able to explain the problems and consequences of heteroskedasticity and autocorrelation

The general linear regression model is y = Xβ + ε with E{ε|X} = 0 and V{ε|X} = σ²Ω,

where Ω is a positive definite matrix. Two cases we shall consider in detail are heteroskedasticity and autocorrelation. Disturbances are heteroskedastic when their variances differ across observations (the scale of the dependent variable and the explanatory power of the model may vary over observations).

Autocorrelation. Economic time series often display a “memory”, in that variation around the regression function is not independent from one period to the next: cov{εt, εs} ≠ 0 for t ≠ s.

Consequences for the OLS estimator. What are the consequences for the OLS estimator b when the assumption of spherical errors is dropped? b remains unbiased, since E{b|X} = β + (X'X)⁻¹X'E{ε|X} = β. Its conditional variance, however, becomes V{b|X} = σ²(X'X)⁻¹X'ΩX(X'X)⁻¹, which differs from the usual expression σ²(X'X)⁻¹.

Consequently, although the OLS estimator is still unbiased, its routinely computed variance and standard errors will be based on the wrong expression

Thus, standard t-tests and F-tests will no longer be valid and inferences will be misleading. Since not all Gauss-Markov conditions are satisfied (E{εε'|X} ≠ σ²I), efficiency is lost: under conditions A.1, A.2, A.3' and A.4', the OLS estimator b is not BLUE.

Consistency, however, remains. These consequences indicate two ways of handling the problems of heteroskedasticity and autocorrelation: 1) derivation of an alternative estimator that is BLUE, or 2) sticking to the OLS estimator but somehow adjusting the standard errors to allow for heteroskedasticity and/or autocorrelation. In some cases the presence of heteroskedasticity and autocorrelation may occur because the estimated model is misspecified in one way or the other; typically these are dynamic misspecification, omitted variables, and functional form misspecification.

Autocorrelation creates some problems for least squares: the least squares estimator is still linear and unbiased, but it is not efficient; and the formulas normally used to compute the least squares standard errors are no longer correct, so confidence intervals and hypothesis tests based on them will be wrong.

Explain how to construct and run the Breusch-Pagan test and the White test for heteroskedasticity

Breusch and Pagan (1979) developed a Lagrange Multiplier (LM) test for heteroskedasticity. Consider the model yᵢ = β₁ + β₂x₂ᵢ + … + βₖxₖᵢ + εᵢ, where the variance of εᵢ is suspected to depend on a set of p variables zᵢ (often the regressors themselves).

The test is given by the following steps: 1) Run a regression of the above model and obtain the residuals êᵢ of this regression. 2) Run the following auxiliary regression: êᵢ² = α₁ + α₂z₂ᵢ + … + αₚzₚᵢ + vᵢ.

3) Set the null and the alternative hypotheses. The null of homoskedasticity is given by H₀: α₂ = … = αₚ = 0, against the alternative that at least one of them is non-zero.

4) Compute the LM = nR² statistic, where n is the number of observations used to estimate the auxiliary regression in 2) and R² is the coefficient of determination of that regression. Under the null, the LM statistic follows the χ² distribution with p − 1 degrees of freedom. 5) We reject the null hypothesis of homoskedasticity if the LM statistic is bigger than the critical value (LM-stat > the χ² critical value with p − 1 degrees of freedom at the chosen significance level).
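A sketch using the ready-made implementation in statsmodels (simulated heteroskedastic data; variable names are illustrative):

```python
# Sketch: Breusch-Pagan LM test; the error variance depends on x1 by construction.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(6)
n = 300
x1 = rng.uniform(1, 5, n)
X = sm.add_constant(x1)
y = 1.0 + 0.5 * x1 + rng.normal(scale=0.3 * x1)        # variance grows with x1

res = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(lm_stat, lm_pvalue)      # LM = nR^2 from the auxiliary regression of e^2 on z
```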

All the above tests for heteroskedasticity test for deviations from the null of homoskedasticity in particular directions. The White (1980) test does not require additional structure on the alternative hypothesis. Specifically, it tests the general hypothesis of the form H₀: V{εᵢ|xᵢ} = σ² for all i.

The test exploits the fact that, if there is no heteroskedasticity, the squared OLS residuals should be unrelated to the regressors, their squares and their cross-products.

A simple operational version of this test is carried out by computing nR² in the regression of êᵢ² on a constant and all (unique) first moments, second moments, and cross-products of the original regressors. The test statistic is asymptotically distributed as χ²(p), where p is the number of regressors in the auxiliary regression, excluding the intercept.
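The White test has an analogous ready-made implementation (again a sketch on simulated data):

```python
# Sketch: White test, which uses the regressors, their squares and cross-products.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(7)
n = 300
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(0, 2, n)
X = sm.add_constant(np.column_stack([x1, x2]))
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.3 * x1)

res = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, X)
print(lm_stat, lm_pvalue)      # nR^2, asymptotically chi2(p) under homoskedasticity
```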

Be able to explain the two approaches to estimation in the presence of heteroskedasticity and autocorrelation

Approach 1: stick to the OLS estimator, while adjusting the standard errors (this does not require us to specify the form of the heteroskedasticity). OLS applied to the generalized linear regression model is unbiased and consistent for β. The appropriate covariance matrix is given by V{b|X} = σ²(X'X)⁻¹X'ΩX(X'X)⁻¹.

The White (1980) estimator of this covariance matrix is given by V̂{b} = (X'X)⁻¹ (Σᵢ êᵢ² xᵢxᵢ') (X'X)⁻¹.

Standard errors computed as the square roots of the diagonal elements of this matrix are usually referred to as heteroskedasticity-consistent standard errors or simply White standard errors. Approach 2: use Feasible GLS (specify the heteroskedasticity up to some unknown parameters) and replace the unknown variances σᵢ² with consistent estimates.

This is not as easy as it sounds, since we have n of them, one per observation. For consistency, we need to specify the form our heteroskedasticity takes. The simplest case is when we expect σᵢ² = zᵢ'α for some observed variables zᵢ; then a consistent estimator of α will be given by the least squares slopes in the “model” êᵢ² = zᵢ'α + vᵢ. If, instead, we expect multiplicative heteroskedasticity of the form σᵢ² = σ² exp(zᵢ'α), then consistent estimates of α can be obtained from least squares estimation of the “model” log êᵢ² = zᵢ'α + vᵢ.
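Both approaches have direct counterparts in statsmodels (a sketch on simulated data; the exponential variance function assumed in the FGLS step is one possible specification, chosen purely for illustration):

```python
# Sketch: Approach 1 (White standard errors) versus Approach 2 (feasible GLS / WLS).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 300
x1 = rng.uniform(1, 5, n)
X = sm.add_constant(x1)
y = 1.0 + 0.5 * x1 + rng.normal(scale=0.3 * x1)

# Approach 1: keep OLS, report heteroskedasticity-consistent (White) standard errors
ols_white = sm.OLS(y, X).fit(cov_type="HC0")
print(ols_white.bse)                                    # White standard errors

# Approach 2: feasible GLS, assuming sigma_i^2 = sigma^2 * exp(z_i'alpha) with z_i = x_i
ols = sm.OLS(y, X).fit()
aux = sm.OLS(np.log(ols.resid ** 2), X).fit()           # regress log e^2 on z
sigma2_hat = np.exp(aux.fittedvalues)                   # consistent variance estimates
fgls = sm.WLS(y, X, weights=1.0 / sigma2_hat).fit()     # weight by the inverse variance
print(fgls.params, fgls.bse)
```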

Be able to explain the difference between an AR and an MA autocorrelation structure for the error term

First Order Autocorrelation AR(1). The most popular form is known as the first order autoregressive process: εt = ρεt−1 + vt, where vt is white noise.

With the condition that |ρ| < 1, we say that the first-order autoregressive process is stationary. The assumption that |ρ| < 1 is needed to ensure a finite and positive variance. The situation where |ρ| = 1 has received a lot of attention in the current literature in macroeconometrics, and is called “unit roots”. The covariance matrix in this case has V{εt} = σv²/(1 − ρ²) on the diagonal and cov{εt, εt−s} = ρˢ σv²/(1 − ρ²) off the diagonal, so the autocorrelations decay geometrically with the lag s.

Aside: Given stationarity it is easy to derive the mean and covariance matrix

Higher Order Autocorrelation AR(p)

In macroeconomic time series models, in most cases our errors exhibit autocorrelation of order one. However, when we have quarterly or monthly data, it is possible that there is a periodic (quarterly or monthly) effect that causes the errors across the same periods but in different years to be correlated; e.g., with quarterly data, εt = ρ₄εt−4 + vt.

In general, the AR(p) process is εt = ρ₁εt−1 + ρ₂εt−2 + … + ρₚεt−p + vt.

Moving Average Errors: MA(q). In some cases (economic) theory suggests that only particular error terms are correlated, while all others have a zero correlation. This can be modelled by a so-called moving average error process: εt = vt + α₁vt−1 + … + αqvt−q, with vt white noise.

All finite order MA processes are covariance stationary. For an MA(1) process, εt = vt + αvt−1, it is easily verified that V{εt} = σv²(1 + α²), cov{εt, εt−1} = ασv², and cov{εt, εt−s} = 0 for s > 1, so the autocorrelation cuts off after the first lag, in contrast to the geometric decay of the AR(1).
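A small simulation sketch of the contrast (numpy/statsmodels, illustrative parameter values): the AR(1) autocorrelations decay geometrically, while the MA(1) autocorrelations cut off after the first lag.

```python
# Sketch: sample autocorrelations of simulated AR(1) and MA(1) error processes.
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(9)
T = 5000
v = rng.normal(size=T)                      # white noise

rho, eps_ar = 0.7, np.zeros(T)              # AR(1): eps_t = rho*eps_{t-1} + v_t
for t in range(1, T):
    eps_ar[t] = rho * eps_ar[t - 1] + v[t]

alpha = 0.7                                 # MA(1): eps_t = v_t + alpha*v_{t-1}
eps_ma = v + alpha * np.concatenate([[0.0], v[:-1]])

print(acf(eps_ar, nlags=4))   # roughly 1, 0.70, 0.49, 0.34, ... (geometric decay)
print(acf(eps_ma, nlags=4))   # roughly 1, 0.47, 0, 0, 0         (cuts off after lag 1)
```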

Be able to explain and run the Durbin-Watson and the Breusch-Godfrey tests for autocorrelation

Durbin-Watson Test (AR(1)). Two important assumptions underlying this test are that we can treat the xt's as deterministic and that xt contains an intercept. The first excludes the inclusion of lagged dependent variables in the model. The Durbin-Watson statistic (DW) is given by DW = Σt=2,…,T (êt − êt−1)² / Σt=1,…,T êt², which is approximately equal to 2 − 2ρ̂, where ρ̂ is the estimated first-order autocorrelation of the residuals.

For the DW test there are two critical values, a lower bound dL and an upper bound dU (which depend on n and k): a DW value below dL indicates positive autocorrelation, a value between dL and dU is inconclusive, and a value above dU (but below 4 − dU) indicates no positive autocorrelation.
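A sketch of computing DW from OLS residuals (statsmodels, with simulated AR(1) errors for illustration):

```python
# Sketch: Durbin-Watson statistic; DW is roughly 2(1 - rho_hat).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(10)
T = 300
x = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):                        # AR(1) errors with rho = 0.6
    eps[t] = 0.6 * eps[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 0.8 * x + eps

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))              # well below 2: positive autocorrelation
```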

Breusch-Godfrey LM Test for Serial Correlation. Consider the following model: yt = β₁ + β₂x₂t + … + βₖxₖt + ut   (1)

and an AR(p) process for its error term: ut = ρ₁ut−1 + ρ₂ut−2 + … + ρₚut−p + εt.

The test combines these two equations: yt = β₁ + β₂x₂t + … + βₖxₖt + ρ₁ut−1 + … + ρₚut−p + εt.

The null and the alternative hypotheses are H₀: ρ₁ = … = ρₚ = 0 (no serial correlation) and HA: at least one ρⱼ ≠ 0.

The steps in the test are given: 1) Estimate eq (1) by OLS and obtain the residuals ût. 2) Run the following auxiliary regression of the residuals on the regressors and the lagged residuals: ût = δ₁ + δ₂x₂t + … + δₖxₖt + ρ₁ût−1 + … + ρₚût−p + vt.

3) Compute the LM statistic = (n − p)R² from this auxiliary regression; under the null it is asymptotically distributed as χ²(p). We reject the null hypothesis of no autocorrelation if the LM statistic is bigger than the corresponding critical value.
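The test is also available ready-made in statsmodels; a sketch on simulated AR(1) errors (here nlags plays the role of p):

```python
# Sketch: Breusch-Godfrey LM test for serial correlation up to order p.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(11)
T = 300
x = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):                        # AR(1) errors
    eps[t] = 0.6 * eps[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 0.8 * x + eps

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=4)
print(lm_stat, lm_pvalue)                    # small p-value: reject H0 of no autocorrelation
```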

