1. OLS assumption: The error term and the independent variables are not correlated.
If this assumption is violated, OLS generates biased estimates (the expected value of Beta-hat is not equal to B). A biased estimator is wrong on average: even across many samples, the estimates do not center on the true coefficient. When we use OLS to estimate the Beta-hats, it will ALWAYS force the correlation between the independent variables and the residuals to be zero. Therefore, if the error (from the true model) is correlated with the independent variable(s), OLS attributes part of the error's effect to those variables and generates biased Beta-hats.
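The point that OLS forces the residuals to be uncorrelated with the regressors, even when the true error is not, can be checked with a small simulation. The sketch below is only an illustration under assumed values: the sample size, the true slope of 2, the 0.8 link between x and the error, and the normal distributions are all hypothetical, and it uses plain Python rather than Eviews.

```python
import random

random.seed(0)
n = 5000  # hypothetical sample size

# Hypothetical data: the true error e is built to be correlated with x.
x = [random.gauss(0, 1) for _ in range(n)]
e = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]  # assumed true slope is 2


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


# Simple-regression OLS formulas for slope and intercept.
b1 = cov(x, y) / cov(x, x)
b0 = mean(y) - b1 * mean(x)
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

print("cov(x, true error) =", round(cov(x, e), 3))      # clearly nonzero
print("cov(x, residuals)  =", round(cov(x, resid), 3))  # zero up to rounding
print("estimated slope b1 =", round(b1, 3))              # pulled away from 2
```

The residuals look "clean" by construction, so a zero correlation between residuals and regressors in your output says nothing about whether the assumption holds for the true error.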
This assumption is likely to be violated when estimating Demand and/or Supply equations (but there is a way to correct for this, referred to as two stage least squares, which gives consistent estimates). This assumption can also be violated when relevant independent variables are not included in the regression, but IF and ONLY IF the omitted variables are correlated with the independent variables already in the regression model; the resulting bias is often referred to as "omitted variable bias". You may well have this problem in your project. The example below shows this bias when we omit a variable.
Suppose the true model is
Y = B0 + B1*X1 + B2*X2 + e (equation 1)
and X1 = B5*X2 + ex1 (equation 2)
where the Bs are the true coefficients and e and ex1 are error terms.
The true model indicates that X1 and X2 are correlated (through B5; that's why we need equation 2).
a) If we estimate the following regression (X1 not included in the regression): Y = b0 + b2*X2 + u1 (lower case b is used for B-hat), is the expected value of b2 equal to B2?
Show algebraically what the expected value of b2 is. Show every step of your algebraic work.
NOTE: If we write equation (1) as Y = B0 + B2*X2 + e2, where e2 = (B1*X1 + e), it is easy to see why e2 and X2 are correlated: by equation (2), whenever X2 changes, X1 changes, and therefore so does e2. This correlation between X2 and e2 violates the OLS assumption stated in part 1.
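A quick simulation can make the NOTE concrete. The sketch below generates data from equations (1) and (2) and then checks the correlation between X2 and the composite error e2, along with the slope from the regression that leaves X1 out. The coefficient values, the sample size and the standard normal errors are assumptions chosen only for illustration. It does not replace the algebra asked for in part a).

```python
import random
from math import sqrt

random.seed(1)
n = 5000  # hypothetical sample size

# Hypothetical coefficient values, chosen only for illustration.
B0, B1, B2, B5 = 1.0, 2.0, 2.0, -3.0

X2 = [random.gauss(0, 1) for _ in range(n)]
ex1 = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 1) for _ in range(n)]
X1 = [B5 * x2 + u for x2, u in zip(X2, ex1)]                         # equation 2
Y = [B0 + B1 * x1 + B2 * x2 + ei for x1, x2, ei in zip(X1, X2, e)]  # equation 1

# Composite error when X1 is left out: e2 = B1*X1 + e
e2 = [B1 * x1 + ei for x1, ei in zip(X1, e)]


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def corr(a, b):
    """Pearson correlation built from the covariance helper."""
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))


# Slope from the short regression of Y on X2 alone (X1 omitted).
b2_short = cov(X2, Y) / cov(X2, X2)

print("corr(X2, e2)   =", round(corr(X2, e2), 3))
print("b2, X1 omitted =", round(b2_short, 3))
print("true B2        =", B2)
```

Comparing the printed slope with the true B2 gives a number you can check your algebraic expression for the expected value of b2 against.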
2. Estimations
Use data from this Excel file for the following exercises (copy the Excel file data into Eviews, or copy the data into a new Excel file to read into Eviews). Use Eviews for all estimations below.
The data in the Excel file is from a simulation of the following model (this is the 'true' model):
Y = 1.0 + 2*X1 + 2*X2 + e1
and X1 = (-3)*X2 + e2
where X2, e1 and e2 are all random variables. e1 and e2 are error terms.
a) Generate a correlation matrix (or compute the pairwise correlations one by one) for the dependent and independent variables: Y, X1, X2.
Is Y correlated with X1 and X2? Are X1 and X2 correlated?
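For anyone who wants to cross-check the Eviews correlation matrix outside Eviews, the sketch below computes the pairwise correlations by hand. It does not read the Excel file: it uses stand-in data simulated from the stated model, with a hypothetical sample size of 200 and standard normal X2, e1 and e2 (the handout does not state these distributions), so the numbers will not match the file's.

```python
import random
from math import sqrt

random.seed(2)
n = 200  # hypothetical sample size; the Excel file's size may differ

# Stand-in data simulated from the stated model (NOT the Excel data).
X2 = [random.gauss(0, 1) for _ in range(n)]
X1 = [-3.0 * x2 + random.gauss(0, 1) for x2 in X2]
Y = [1.0 + 2.0 * x1 + 2.0 * x2 + random.gauss(0, 1) for x1, x2 in zip(X1, X2)]


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


series = {"Y": Y, "X1": X1, "X2": X2}
names = list(series)
# matrix[i][j] is the correlation between names[i] and names[j].
matrix = [[corr(series[r], series[c]) for c in names] for r in names]

print("      " + "".join(f"{c:>8}" for c in names))
for r, row in zip(names, matrix):
    print(f"{r:<6}" + "".join(f"{v:8.3f}" for v in row))
```

The matrix is symmetric with ones on the diagonal, so only the three off-diagonal pairs need interpreting.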
b) Estimate the following regression: Y = b0 + b1*X1 + b2*X2 + u1
where u1 are the OLS residuals.
Are the estimated slope coefficients (b1 and b2) close to the true coefficients (a.k.a. population coefficients)?
Are the estimated coefficients significantly different from the true coefficients (which we know)? Test the null using a two-sided test at the 10% level of significance.
Clearly state the null and the alternative hypotheses, show how you compute the t-statistics, state the critical t-value, the degrees of freedom, your criteria for rejection and your conclusion.
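The mechanics of the test can be sketched as follows: t = (b - true B) / se(b), with n - 3 degrees of freedom (three estimated coefficients), compared with the two-sided 10% critical value. This is a rough illustration under stated assumptions, not the required Eviews work. The data are simulated stand-ins (hypothetical n = 200, standard normal X2, e1 and e2), and the critical value uses the large-sample normal approximation because the Python standard library has no t distribution; with a small sample, look up the t critical value for your degrees of freedom instead.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(3)
n = 200  # hypothetical sample size

# Stand-in data simulated from the stated model (NOT the Excel data).
X2 = [random.gauss(0, 1) for _ in range(n)]
X1 = [-3.0 * x2 + random.gauss(0, 1) for x2 in X2]
Y = [1.0 + 2.0 * x1 + 2.0 * x2 + random.gauss(0, 1) for x1, x2 in zip(X1, X2)]


def mean(v):
    return sum(v) / len(v)


def dev(v):
    """Deviations from the mean."""
    m = mean(v)
    return [vi - m for vi in v]


d1, d2, dy = dev(X1), dev(X2), dev(Y)
S11 = sum(a * a for a in d1)
S22 = sum(a * a for a in d2)
S12 = sum(a * b for a, b in zip(d1, d2))
S1y = sum(a * b for a, b in zip(d1, dy))
S2y = sum(a * b for a, b in zip(d2, dy))
det = S11 * S22 - S12 ** 2

# OLS with two regressors, solved from the centered normal equations.
b1 = (S22 * S1y - S12 * S2y) / det
b2 = (S11 * S2y - S12 * S1y) / det
b0 = mean(Y) - b1 * mean(X1) - b2 * mean(X2)

resid = [y - b0 - b1 * x1 - b2 * x2 for y, x1, x2 in zip(Y, X1, X2)]
df = n - 3                          # observations minus estimated coefficients
s2 = sum(r * r for r in resid) / df  # estimated error variance
se_b1 = sqrt(s2 * S22 / det)
se_b2 = sqrt(s2 * S11 / det)

# H0: coefficient equals its true value (2 for both slopes); H1: it does not.
t_b1 = (b1 - 2.0) / se_b1
t_b2 = (b2 - 2.0) / se_b2

# Two-sided 10% test: 5% in each tail (normal approximation to the t value).
crit = NormalDist().inv_cdf(0.95)

for name, b, se, t in (("b1", b1, se_b1, t_b1), ("b2", b2, se_b2, t_b2)):
    decision = "reject H0" if abs(t) > crit else "fail to reject H0"
    print(f"{name}={b:.3f} se={se:.3f} t={t:.3f} crit={crit:.3f} df={df}: {decision}")
```

Note that the null value here is the true coefficient, not zero, so the t-statistic differs from the default one in the Eviews regression output, which tests each coefficient against zero.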