Title | EES 400 - Lecture notes |
---|---|
Course | Econometrics 1 |
Institution | Kenyatta University |
ECONOMETRICS: KENYATTA UNIVERSITY EES 400: FUNDAMENTALS OF ECONOMETRICS I

TOPIC ONE (1): INTRODUCTION TO ECONOMETRICS

1.1 MEANING OF ECONOMETRICS

Econometrics is the social science that applies the tools of economic theory, mathematics and statistical inference to analyze economic phenomena and to estimate causal relationships among variables, i.e. how a change in one variable affects another. Thus, econometrics combines economic theory, mathematics and statistics. This amalgamation of the three subjects is vital, as illustrated below.

Definition of Econometrics (Greene, 2003): "Econometrics is the field of economics that concerns itself with the application of mathematical statistics, and tools of statistical inference, to the empirical measurement of relationships postulated by economic theory."

(a) Economic theory

Economic theory forms the basis for any econometric work, and should really be the starting point for econometric analysis. However, theory itself lacks empirical content. For example, economic theory tells us that people tend to increase their consumption expenditure whenever their disposable income rises, i.e. the economic theory of consumption. However, as can be noted, the theory fails to give empirical content, e.g. by how much will consumption expenditure increase if income increases by one unit? This missing empirical content is of interest to an econometrician. Thus, econometrics helps to empirically verify economic theory.

(b) Mathematical economics
Mathematical economics helps us to convert the theory into an equation, which is then empirically tested by the econometrician. Thus, with mathematical economics, we can write the consumption theory as C = d + bY.

(c) Statistics

Statistics provides the econometrician with the know-how and tools of data collection, processing, analysis and presentation of results. Thus, statistics helps to test theories and explain the results.

Note: none of these three subjects independently constitutes econometrics. It is their UNIFICATION that constitutes econometrics (ECONOMETRICA).

1.2 THE METHODOLOGY OF ECONOMETRICS

Econometricians basically carry out their economic analysis in the following eight steps:

i.
Economic theory
Economic theory is the starting point or basis for econometric work. For example, in order to estimate a consumption function, one should study the economic theories of consumption, e.g. the Keynesian consumption function. Having understood the theory, you can then state your hypothesis.

ii.
Mathematical model for the theory
The next step is to formulate a linear single-equation mathematical model for the theory, e.g. the Keynesian consumption theory, i.e. Y = f(X), such that:

Y = β₁ + β₂X

Where:
Y = consumption expenditure, i.e. the dependent variable, since it is determined within the model (or by the theory);
X = income of the consumer, i.e. the independent or explanatory variable, as it is determined from outside the model;
β₁ = intercept coefficient (i.e. the value of Y when X = 0);
β₂ = slope coefficient (i.e. the change in Y brought about by a unit change in X; thus, it is the marginal propensity to consume (MPC) in a consumption function).
[Figure: the consumption function Y = β₁ + β₂X, with consumption expenditure on the vertical axis and income (X) on the horizontal axis; the line has intercept β₁ and slope β₂.]

Slope = ΔY/ΔX = β₂ = MPC, such that 0 < β₂ < 1.
Step 5: The goodness of fit

Now, the ratio ESS/TSS is called the goodness of fit (r²). Therefore:

r² = ESS/TSS = Σŷ²/Σy², or r² = 1 − RSS/TSS = 1 − Σeᵢ²/Σy²

Another formula for r² is:

r² = (Σxy)² / (Σx² · Σy²), or r² = β̂₂² · Σx² / Σy²

Recall:

r² = 1 − Σeᵢ²/Σy² = 1 − 9.806/244 = 0.9598

Or: r² = (Σxy)² / (Σx² · Σy²) = 1390² / (8250 × 244) = 0.9598, or 95.98%

Or: r² = β̂₂² · Σx² / Σy² = 0.1685² × 8250 / 244 = 0.9598, or 95.98%
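As a check on the arithmetic, the three r² formulas above can be sketched in Python. The summary statistics (Σxy = 1390, Σx² = 8250, Σy² = 244, Σeᵢ² = 9.806, β̂₂ = 0.1685) are taken from the worked example; the variable names are illustrative:

```python
# Goodness of fit computed three ways, using the worked example's
# summary statistics from the notes.
sum_xy, sum_x2, sum_y2 = 1390.0, 8250.0, 244.0
sum_e2, beta2_hat = 9.806, 0.1685

r2_from_rss = 1 - sum_e2 / sum_y2            # r^2 = 1 - RSS/TSS
r2_from_xy = sum_xy**2 / (sum_x2 * sum_y2)   # r^2 = (Σxy)^2 / (Σx^2 · Σy^2)
r2_from_b2 = beta2_hat**2 * sum_x2 / sum_y2  # r^2 = β̂₂² Σx^2 / Σy^2

print(round(r2_from_rss, 4), round(r2_from_xy, 4))  # 0.9598 0.9598
```

The third formula gives approximately 0.9600 rather than 0.9598 only because β̂₂ is rounded to four decimal places.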
CONFIDENCE INTERVAL ESTIMATION

Confidence interval estimation aims at constructing an interval around the OLS estimators. The confidence interval for the slope coefficient β₂ is given as follows:

P[β̂₂ − t(α/2)·se(β̂₂) ≤ β₂ ≤ β̂₂ + t(α/2)·se(β̂₂)] = 1 − α

Where:
- β̂₂ is the estimated OLS estimator for β₂;
- t(α/2) is the critical t value for a two-tailed test at n − k degrees of freedom;
- se(β̂₂) is the standard error of the slope coefficient β̂₂;
- 1 − α is the confidence level, e.g. 99%, 95%, and 90%;
- α is the level of significance, e.g. 1%, 5%, and 10%.
The figure below illustrates the confidence interval for β₂:

[Figure: a t distribution centred on β̂₂, with the acceptance region β̂₂ ± t(α/2)·se(β̂₂) carrying probability 1 − α, and a rejection region of probability α/2 in each tail.]
In the diagram above, the shaded part is the rejection region, while the un-shaded part is the acceptance region. The following table shows the appropriate critical t values at various levels of significance, for one-tail and two-tail tests:

Level of significance | One-tail (α) | Two-tail (α/2) | t-critical, one-tail | t-critical, two-tail | 1 − α
---|---|---|---|---|---
α = 1% | 0.01 | 0.005 | 2.326 | 2.576 | 99%
α = 5% | 0.05 | 0.025 | 1.645 | 1.960 | 95%
α = 10% | 0.10 | 0.05 | 1.282 | 1.645 | 90%
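The tabulated critical values are the large-sample (standard normal) ones; as a quick sketch, they can be reproduced with Python's standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution
for alpha in (0.01, 0.05, 0.10):
    one_tail = z.inv_cdf(1 - alpha)      # critical value, one-tail test
    two_tail = z.inv_cdf(1 - alpha / 2)  # critical value, two-tail test
    print(f"alpha={alpha:.2f}: one-tail={one_tail:.3f}, two-tail={two_tail:.3f}")
```

At only 8 degrees of freedom, as in the example that follows, the exact t value (2.306) is larger than these large-sample values.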
For example, a 95% confidence interval for β₂ in a two-tailed test is obtained as follows:

β̂₂ = 0.1685; n = 10; k = 2; n − k = 10 − 2 = 8 degrees of freedom; and 1 − α = 0.95, hence α = 5% = 0.05. Thus:

se(β̂₂) = 0.01219 and t(0.025, 8 df) = 2.306

P[β̂₂ − t(0.025, 8 df)·se(β̂₂) ≤ β₂ ≤ β̂₂ + t(0.025, 8 df)·se(β̂₂)] = 95%
P[0.1685 − 2.306 × 0.01219 ≤ β₂ ≤ 0.1685 + 2.306 × 0.01219] = 95%
P[0.1685 − 0.02811 ≤ β₂ ≤ 0.1685 + 0.02811] = 95%
P[0.1404 ≤ β₂ ≤ 0.1966] = 95%
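The interval computation above can be sketched in Python; β̂₂, se(β̂₂) and the critical value 2.306 are taken from the example, and the variable names are illustrative:

```python
# 95% confidence interval for the slope, using the example's values.
beta2_hat = 0.1685
se_beta2 = 0.01219
t_crit = 2.306  # t(0.025, 8 df)

lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
print(f"95% CI for beta2: [{lower:.4f}, {upper:.4f}]")  # [0.1404, 0.1966]
```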
Hence, the 95% confidence interval for β₂ is 0.1404 ≤ β₂ ≤ 0.1966.

HYPOTHESIS TESTING

A hypothesis is a guess or a hunch about something. By hypothesis testing, we mean: "can our regression results be trusted?" or, "do our regression estimates matter?" There are two types of hypotheses:
- The null hypothesis
- The alternative hypothesis

The null hypothesis is the hypothesis of interest. It is usually denoted by H0. For example, to test whether the slope coefficient is significant, we state: H0: β₂ = 0. The alternative hypothesis is the hypothesis that is tested against the hypothesis of interest, i.e. the null hypothesis. The alternative hypothesis is denoted by H1 or HA. For example, the alternative hypothesis for testing whether the slope coefficient is significant is stated as follows:
- H1: β₂ ≠ 0 for the case of a two-tailed test
- H1: β₂ > 0 or H1: β₂ < 0 for the case of a one-tailed test

For the first set of hypotheses (H0: β₂ = 0 against H1: β₂ ≠ 0), t-calculated is obtained as follows:

t-calculated = β̂₂ / se(β̂₂) = 0.1685 / 0.01219 = 13.82

while t-critical = 2.306 at 8 degrees of freedom. Upon comparing the two, we notice that t-calculated > t-critical. Thus, according to our decision rule, we reject the null hypothesis but do not reject (accept) the alternative hypothesis.
In conclusion, we can therefore say that β₂ is not equal to zero, or, β₂ is statistically different from zero.

For the second set of hypotheses (H0: β₂ = 0.16 against H1: β₂ ≠ 0.16), we can obtain t-calculated as follows:

t-calculated = (β̂₂ − β₂*) / se(β̂₂) = (0.1685 − 0.16) / 0.01219 = 0.6973

The value for t-critical remains the same, t-critical = 2.306. Upon comparing t-calculated and t-critical, we notice that t-calculated < t-critical. Thus, following the decision rule, we do not reject (accept) the null hypothesis. In conclusion, we can therefore say that β₂ is statistically equal to 0.16.

POINT TO NOTE: The conclusions from the confidence interval approach resemble the conclusions from the test of significance approach, and this must always be so. Indeed, the confidence interval approach is simply a mirror image of the test of significance approach.
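Both tests of significance can be sketched in Python, with the values taken from the example above:

```python
# Two t tests on the slope coefficient, using the example's values.
beta2_hat, se_beta2, t_crit = 0.1685, 0.01219, 2.306

# H0: beta2 = 0 against H1: beta2 != 0
t_calc_zero = (beta2_hat - 0.0) / se_beta2
reject_zero = abs(t_calc_zero) > t_crit  # True: beta2 differs from zero

# H0: beta2 = 0.16 against H1: beta2 != 0.16
t_calc_016 = (beta2_hat - 0.16) / se_beta2
reject_016 = abs(t_calc_016) > t_crit    # False: beta2 is statistically 0.16

print(round(t_calc_zero, 2), reject_zero)   # 13.82 True
print(round(t_calc_016, 4), reject_016)     # 0.6973 False
```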
3. HYPOTHESIS TESTING USING THE PROBABILITY (P) VALUE APPROACH

The probability (P) value approach is also an ideal way of testing hypotheses. The P-value states the smallest level of significance (α) at which the null hypothesis can be rejected. The beauty of the P-value approach is that most computer software (Excel, SPSS, STATA, Eviews, SHAZAM, RATS, etc.) automatically provides the P-value whenever you run a regression. For example, if the software reports a P-value of 0.07, the null hypothesis can be rejected at any level of significance of 7% or above. Thus, we can reject the null hypothesis at 10%, but we cannot reject it at 5% or 1%. The table below summarizes some P-values and significance levels.

P-value | Details | Significant at 1%? | Significant at 5%? | Significant at 10%?
---|---|---|---|---
P = 0.0000 | β̂₂ is significant at all levels | Yes | Yes | Yes
P = 0.035 | β̂₂ is significant at 3.5% | No | Yes | Yes
P = 0.074 | β̂₂ is significant at 7.4% | No | No | Yes
P = 0.1025 | β̂₂ is significant at 10.25% | No | No | No

In summary: the smaller the P-value, the more significant is β̂₂.
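The decision rule behind the table can be sketched as a one-line check, reject H0 whenever P < α (a minimal illustration; the function name is ours, not a library routine):

```python
# P-value decision rule: the coefficient is significant at level alpha
# whenever the reported P-value is below alpha.
def significant(p_value, alpha):
    """Return True if H0 can be rejected at significance level alpha."""
    return p_value < alpha

# The P-values from the table above, checked at 1%, 5% and 10%:
for p in (0.0000, 0.035, 0.074, 0.1025):
    print(p, [significant(p, a) for a in (0.01, 0.05, 0.10)])
```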
REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE

Analysis of variance (ANOVA) is a study of the Total Sum of Squares (TSS) and its components, i.e. the Explained Sum of Squares (ESS) and the Sum of Squared Residuals (RSS). The concept here is that:

ESS + RSS = TSS
Σŷ² + Σû² = Σy²

By dividing each sum of squares (SS) by its associated degrees of freedom (df), we get the mean sum of squares (MSS). The ANOVA table therefore shows the source of variation, the sum of squares (SS), the degrees of freedom (df), and the mean sum of squares (MSS).
Source of variation | Sum of squares (SS) | df | Mean sum of squares (MSS)
---|---|---|---
Due to regression (ESS) | Σŷ², or β̂₂²Σx² | k − 1 | MSS_reg = ESS/df = β̂₂²Σx²/(k − 1)
Due to residuals (RSS) | Σû² | n − k | MSS_res = RSS/df = Σû²/(n − k)
Total (TSS) | Σy² | n − 1 | 
From the ANOVA table, the F statistic is computed as follows:

F = MSS_reg / MSS_res = [ESS/(k − 1)] / [RSS/(n − k)]

i.e. the mean sum of squares due to regression divided by the mean sum of squares due to residuals. The F statistic follows the F distribution with (k − 1) degrees of freedom in the numerator and (n − k) degrees of freedom in the denominator. It is used to test for the overall significance of the model:
- If F-calculated > F-critical, the model is statistically significant.
- If F-calculated < F-critical, the model is not statistically significant.

In the example, F-calculated > F-critical. Conclusion: the overall model is statistically significant.

MULTIPLE REGRESSION ANALYSIS

A multiple regression model is one with more than one explanatory variable. For example, a multiple regression model is expressed as:

Y = β₀ + β₁X1 + β₂X2 + e

Multiple regression is therefore more realistic than simple regression analysis, because it is hardly ever the case that Y is explained by a single explanatory variable. From the regression model as expressed above, the error term is thus expressed as:
eᵢ = Y − β̂₀ − β̂₁X1 − β̂₂X2

Thus, the sum of squared residuals (RSS) is as follows:

Σeᵢ² = Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)²

Recall that the aim of Ordinary Least Squares is to obtain the OLS estimators (β̂₀, β̂₁, and β̂₂) that will minimize the sum of squared residuals (Σeᵢ²). Thus, to obtain the OLS estimators, we need to obtain the first-order partial derivatives of RSS with respect to each estimator and equate them to zero, i.e. optimization.
∂Σeᵢ²/∂β̂₀ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−1) = 0

Expanding and rearranging:

ΣY − nβ̂₀ − β̂₁ΣX1 − β̂₂ΣX2 = 0

Thus,

nβ̂₀ + β̂₁ΣX1 + β̂₂ΣX2 = ΣY ........................................(1)

∂Σeᵢ²/∂β̂₁ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−X1) = 0

Expanding and rearranging:

ΣYX1 − β̂₀ΣX1 − β̂₁ΣX1² − β̂₂ΣX1X2 = 0

Thus,

β̂₀ΣX1 + β̂₁ΣX1² + β̂₂ΣX1X2 = ΣYX1 ........................................(2)

∂Σeᵢ²/∂β̂₂ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−X2) = 0

Expanding and rearranging:

ΣYX2 − β̂₀ΣX2 − β̂₁ΣX1X2 − β̂₂ΣX2² = 0

Thus,

β̂₀ΣX2 + β̂₁ΣX1X2 + β̂₂ΣX2² = ΣYX2 ........................................(3)

The result of optimization is three first-order NORMAL EQUATIONS, which are now reproduced as follows:
nβ̂₀ + β̂₁ΣX1 + β̂₂ΣX2 = ΣY
β̂₀ΣX1 + β̂₁ΣX1² + β̂₂ΣX1X2 = ΣYX1
β̂₀ΣX2 + β̂₁ΣX1X2 + β̂₂ΣX2² = ΣYX2

We can now represent these equations in matrix form as follows:

| n    ΣX1    ΣX2   |   | β̂₀ |   | ΣY   |
| ΣX1  ΣX1²   ΣX1X2 | × | β̂₁ | = | ΣYX1 |
| ΣX2  ΣX1X2  ΣX2²  |   | β̂₂ |   | ΣYX2 |

This is of the familiar form A·β̂ = D. First, obtain the determinant of A.
In order to obtain the values of the OLS estimators β̂₀, β̂₁, and β̂₂, we can employ Cramer's Rule as follows:

o To get β̂₀, replace the first column of matrix A with the column vector D and get the determinant of the new matrix, A1. Then divide determinant A1 by the original determinant A.
o To get β̂₁, replace the second column of matrix A with the column vector D and get the determinant of the new matrix, A2. Then divide determinant A2 by the original determinant A.
o To get β̂₂, replace the third column of matrix A with the column vector D and get the determinant of the new matrix, A3. Then divide determinant A3 by the original determinant A.

Example: QUE: Given the following data, regress Y on X1 and X2.

Y | 6 | 10 | 9 | 14 | 7 | 5
X1 | 1 | 3 | 2 | -2 | 3 | 5
X2 | 3 | -1 | 4 | 6 | 2 | 4
SOLUTION

Y | X1 | X2 | Y² | X1² | X2² | YX1 | YX2 | X1X2
---|---|---|---|---|---|---|---|---
6 | 1 | 3 | 36 | 1 | 9 | 6 | 18 | 3
10 | 3 | -1 | 100 | 9 | 1 | 30 | -10 | -3
9 | 2 | 4 | 81 | 4 | 16 | 18 | 36 | 8
14 | -2 | 6 | 196 | 4 | 36 | -28 | 84 | -12
7 | 3 | 2 | 49 | 9 | 4 | 21 | 14 | 6
5 | 5 | 4 | 25 | 25 | 16 | 25 | 20 | 20
Σ = 51 | 12 | 18 | 487 | 52 | 82 | 72 | 162 | 22

n = 6. Thus, we can write as follows:
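As a check, the column totals above can be recomputed from the raw data (a minimal Python sketch; the dictionary keys are illustrative):

```python
# Recompute the sums needed for the normal equations from the raw data.
Y = [6, 10, 9, 14, 7, 5]
X1 = [1, 3, 2, -2, 3, 5]
X2 = [3, -1, 4, 6, 2, 4]

sums = {
    "Y": sum(Y), "X1": sum(X1), "X2": sum(X2),
    "Y2": sum(y * y for y in Y),
    "X1_2": sum(x * x for x in X1),
    "X2_2": sum(x * x for x in X2),
    "YX1": sum(y * x for y, x in zip(Y, X1)),
    "YX2": sum(y * x for y, x in zip(Y, X2)),
    "X1X2": sum(a * b for a, b in zip(X1, X2)),
}
print(sums)  # Y=51, X1=12, X2=18, Y2=487, X1_2=52, X2_2=82, YX1=72, YX2=162, X1X2=22
```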
| 6   12  18 |   | β̂₀ |   | 51  |
| 12  52  22 | × | β̂₁ | = | 72  |
| 18  22  82 |   | β̂₂ |   | 162 |

According to Cramer's rule, the determinant of A is:

|A| = 6(52×82 − 22×22) − 12(12×82 − 22×18) + 18(12×22 − 52×18) = 3,528

Replacing the first column of A with D and taking the determinant gives |A1| = 41,580, so:

β̂₀ = |A1|/|A| = 41,580/3,528 = 11.7857

Replacing the second column gives |A2| = −4,284, so:

β̂₁ = |A2|/|A| = −4,284/3,528 = −1.2143

Replacing the third column gives |A3| = −1,008, so:

β̂₂ = |A3|/|A| = −1,008/3,528 = −0.2857

Hence, the regression of Y on X1 and X2 is as follows:

Y = 11.7857 − 1.2143X1 − 0.2857X2
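Cramer's rule as applied above can be sketched in Python; the `det3` helper is our own illustrative function, while the matrix and vector come from the normal equations above:

```python
# Solving the 3x3 normal-equation system by Cramer's rule.
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[6, 12, 18], [12, 52, 22], [18, 22, 82]]
D = [51, 72, 162]

detA = det3(A)  # 3528
betas = []
for col in range(3):
    Ai = [row[:] for row in A]
    for r in range(3):
        Ai[r][col] = D[r]          # replace column `col` with D
    betas.append(det3(Ai) / detA)  # Cramer's rule

print([round(b, 4) for b in betas])  # [11.7857, -1.2143, -0.2857]
```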
Interpretation:

- β̂₀ = 11.7857: the average or expected value of Y is 11.7857 when X1 and X2 are both zero.
- β̂₁ = −1.2143: holding X2 constant, a unit increase in X1 will lead to a 1.2143-unit decrease in Y.
- β̂₂ = −0.2857: holding X1 constant, a unit increase in X2 will lead to a 0.2857-unit decrease in Y.

Finding the OLS estimators using the Deviation Method
We can also obtain the OLS estimators β̂₀, β̂₁, and β̂₂ using the deviation method. In deviation form, the normal equations reduce to a 2×2 system:

β̂₁Σx1² + β̂₂Σx1x2 = Σyx1
β̂₁Σx1x2 + β̂₂Σx2² = Σyx2

Solving these simultaneously gives:

β̂₁ = (Σyx1 · Σx2² − Σyx2 · Σx1x2) / (Σx1² · Σx2² − (Σx1x2)²)

β̂₂ = (Σyx2 · Σx1² − Σyx1 · Σx1x2) / (Σx1² · Σx2² − (Σx1x2)²)

And β̂₀ = Ȳ − β̂₁X̄1 − β̂₂X̄2

Where the lower-case letters denote deviations from the sample means: y = Y − Ȳ, x1 = X1 − X̄1, and x2 = X2 − X̄2.
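The deviation method can be sketched in Python on the example data (variable names are illustrative); it reproduces the Cramer's-rule estimates:

```python
# Deviation-method OLS for the two-regressor example data.
Y = [6, 10, 9, 14, 7, 5]
X1 = [1, 3, 2, -2, 3, 5]
X2 = [3, -1, 4, 6, 2, 4]
n = len(Y)

Ybar, X1bar, X2bar = sum(Y) / n, sum(X1) / n, sum(X2) / n
y = [v - Ybar for v in Y]     # deviations from the mean
x1 = [v - X1bar for v in X1]
x2 = [v - X2bar for v in X2]

S = lambda a, b: sum(i * j for i, j in zip(a, b))  # sum of cross-products
denom = S(x1, x1) * S(x2, x2) - S(x1, x2) ** 2

b1 = (S(y, x1) * S(x2, x2) - S(y, x2) * S(x1, x2)) / denom
b2 = (S(y, x2) * S(x1, x1) - S(y, x1) * S(x1, x2)) / denom
b0 = Ybar - b1 * X1bar - b2 * X2bar

print(round(b0, 4), round(b1, 4), round(b2, 4))  # 11.7857 -1.2143 -0.2857
```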