Title | EES 400 - Lecture notes |
---|---|
Course | Econometrics 1 |
Institution | Kenyatta University |
ECONOMETRICS: KENYATTA UNIVERSITY EES 400: FUNDAMENTALS OF ECONOMETRICS I

TOPIC ONE (1): INTRODUCTION TO ECONOMETRICS

1.1 MEANING OF ECONOMETRICS

Econometrics is the social science that applies the tools of economic theory, mathematics and statistical inference to analyze economic phenomena and to estimate causal relationships among variables, i.e. how a change in one variable affects another. Thus, econometrics combines economic theory, mathematics and statistics. This amalgamation of the three subjects is vital, as illustrated below.

Definition of Econometrics (Greene, 2003): "Econometrics is the field of economics that concerns itself with the application of mathematical statistics, and tools of statistical inference, to the empirical measurement of relationships postulated by economic theory."

(a) Economic theory

Economic theory forms the basis for any econometric work, and should really be the starting point for econometric analysis. However, theory itself lacks empirical content. For example, economic theory tells us that people tend to increase their consumption expenditure whenever their disposable income rises, i.e. the economic theory of consumption. However, as can be noted, the theory fails to give empirical content, e.g. by how much will consumption expenditure increase if income increases by one unit? This missing empirical content is of interest to an econometrician. Thus, econometrics helps to empirically verify economic theory.

(b) Mathematical economics
Mathematical economics helps us to convert the theory into an equation, which is then empirically tested by the econometrician. Thus, with mathematical economics, we can write the consumption theory as C = d + bY.

(c) Statistics

Statistics provides the econometrician with the know-how and tools of data collection, processing, analysis and presentation of results. Thus, statistics helps to test theories and explain the results.

Note: none of these three subjects independently constitutes econometrics. It is their UNIFICATION that constitutes econometrics (ECONOMETRICA).

1.2 THE METHODOLOGY OF ECONOMETRICS

Econometricians basically carry out their economic analysis in the following eight steps:

i.
Economic theory
Economic theory is the starting point or basis for econometric work. For example, in order to estimate a consumption function, one should study the economic theories of consumption, e.g. the Keynesian consumption function. Having understood the theory, you can then state your hypothesis.

ii.
Mathematical model for the theory
The next step is to formulate a linear single-equation mathematical model for the theory, e.g. the Keynesian consumption theory, i.e. Y = f(X), such that:

Y = β₁ + β₂X

Where:
Y = consumption expenditure, i.e. the dependent variable, since it is determined within the model (or by the theory);
X = income of the consumer, i.e. the independent or explanatory variable, as it is determined from outside the model;
β₁ = intercept coefficient (i.e. the value of Y when X = 0);
β₂ = slope coefficient (i.e. the change in Y brought about by a unit change in X; thus, it is the marginal propensity to consume (MPC) in a consumption function).
[Figure: the consumption function Y = β₁ + β₂X, with consumption expenditure on the vertical axis and income (X) on the horizontal axis; the line has intercept β₁ and slope β₂.]

Slope = ΔY/ΔX = β₂ = MPC, such that 0 < β₂ < 1.
Step 5: The goodness of fit

Now, the ratio ESS/TSS is called the goodness of fit (r²). Therefore:

r² = ESS/TSS = Σŷ²/Σy², or r² = 1 − RSS/TSS = 1 − Σeᵢ²/Σy²

Another formula for r² is:

r² = (Σxy)² / (Σx² · Σy²), or r² = β̂₂² · Σx² / Σy²

Recall:

r² = 1 − Σeᵢ²/Σy² = 1 − 9.806/244 = 0.9598

Or: r² = (Σxy)² / (Σx² · Σy²) = 1390² / (8250 × 244) = 0.9598, or 95.98%

Or: r² = β̂₂² · Σx² / Σy² = 0.1685² × 8250 / 244 = 0.9598, or 95.98%
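As a check on the arithmetic, the three r² formulas above can be sketched in Python. The summary statistics (Σxy = 1390, Σx² = 8250, Σy² = 244, Σeᵢ² = 9.806, β̂₂ = 0.1685) are taken from the worked example; the variable names are illustrative:

```python
# Goodness of fit computed three ways, using the worked example's
# summary statistics from the notes.
sum_xy, sum_x2, sum_y2 = 1390.0, 8250.0, 244.0
sum_e2, beta2_hat = 9.806, 0.1685

r2_from_rss = 1 - sum_e2 / sum_y2            # r^2 = 1 - RSS/TSS
r2_from_xy = sum_xy**2 / (sum_x2 * sum_y2)   # r^2 = (Σxy)^2 / (Σx^2 · Σy^2)
r2_from_b2 = beta2_hat**2 * sum_x2 / sum_y2  # r^2 = β̂₂² Σx^2 / Σy^2

print(round(r2_from_rss, 4), round(r2_from_xy, 4))  # 0.9598 0.9598
```

The third formula gives approximately 0.9600 rather than 0.9598 only because β̂₂ is rounded to four decimal places.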
CONFIDENCE INTERVAL ESTIMATION

Confidence interval estimation aims at constructing an interval around the OLS estimators. The confidence interval for the slope coefficient β₂ is given as follows:

P[β̂₂ − t(α/2)·se(β̂₂) ≤ β₂ ≤ β̂₂ + t(α/2)·se(β̂₂)] = 1 − α

Where:
- β̂₂ is the estimated OLS estimator for β₂;
- t(α/2) is the critical t value for a two-tailed test at n − k degrees of freedom;
- se(β̂₂) is the standard error of the slope coefficient β̂₂;
- 1 − α is the confidence level, e.g. 99%, 95%, and 90%;
- α is the level of significance, e.g. 1%, 5%, and 10%.
The figure below illustrates the confidence interval for β₂:

[Figure: a t distribution centred on β̂₂, with the acceptance region β̂₂ ± t(α/2)·se(β̂₂) carrying probability 1 − α, and a rejection region of probability α/2 in each tail.]
In the diagram above, the shaded part is the rejection region, while the un-shaded part is the acceptance region. The following table shows the appropriate critical t values at various levels of significance, for one-tail and two-tail tests:

Level of significance | One-tail (α) | Two-tail (α/2) | t-critical, one-tail | t-critical, two-tail | 1 − α
---|---|---|---|---|---
α = 1% | 0.01 | 0.005 | 2.326 | 2.576 | 99%
α = 5% | 0.05 | 0.025 | 1.645 | 1.960 | 95%
α = 10% | 0.10 | 0.05 | 1.282 | 1.645 | 90%
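The tabulated critical values are the large-sample (standard normal) ones; as a quick sketch, they can be reproduced with Python's standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution
for alpha in (0.01, 0.05, 0.10):
    one_tail = z.inv_cdf(1 - alpha)      # critical value, one-tail test
    two_tail = z.inv_cdf(1 - alpha / 2)  # critical value, two-tail test
    print(f"alpha={alpha:.2f}: one-tail={one_tail:.3f}, two-tail={two_tail:.3f}")
```

At only 8 degrees of freedom, as in the example that follows, the exact t value (2.306) is larger than these large-sample values.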
For example, a 95% confidence interval for β₂ in a two-tailed test is obtained as follows:

β̂₂ = 0.1685; n = 10; k = 2; n − k = 10 − 2 = 8 degrees of freedom; and 1 − α = 0.95, hence α = 5% = 0.05. Thus:

se(β̂₂) = 0.01219 and t(0.025, 8 df) = 2.306

P[β̂₂ − t(0.025, 8 df)·se(β̂₂) ≤ β₂ ≤ β̂₂ + t(0.025, 8 df)·se(β̂₂)] = 95%
P[0.1685 − 2.306 × 0.01219 ≤ β₂ ≤ 0.1685 + 2.306 × 0.01219] = 95%
P[0.1685 − 0.02811 ≤ β₂ ≤ 0.1685 + 0.02811] = 95%
P[0.1404 ≤ β₂ ≤ 0.1966] = 95%
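The interval computation above can be sketched in Python; β̂₂, se(β̂₂) and the critical value 2.306 are taken from the example, and the variable names are illustrative:

```python
# 95% confidence interval for the slope, using the example's values.
beta2_hat = 0.1685
se_beta2 = 0.01219
t_crit = 2.306  # t(0.025, 8 df)

lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
print(f"95% CI for beta2: [{lower:.4f}, {upper:.4f}]")  # [0.1404, 0.1966]
```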
Hence, the 95% confidence interval for β₂ is 0.1404 ≤ β₂ ≤ 0.1966.

HYPOTHESIS TESTING

A hypothesis is a guess or a hunch about something. By hypothesis testing, we mean: "can our regression results be trusted?" or, "do our regression estimates matter?" There are two types of hypotheses:
- The null hypothesis
- The alternative hypothesis

The null hypothesis is the hypothesis of interest. It is usually denoted by H0. For example, to test whether the slope coefficient is significant, we state: H0: β₂ = 0. The alternative hypothesis is the hypothesis that is tested against the hypothesis of interest, i.e. the null hypothesis. The alternative hypothesis is denoted by H1 or HA. For example, the alternative hypothesis for testing whether the slope coefficient is significant is stated as follows:
- H1: β₂ ≠ 0 for the case of a two-tailed test
- H1: β₂ > 0 or H1: β₂ < 0 for the case of a one-tailed test

For the first set of hypotheses (H0: β₂ = 0 against H1: β₂ ≠ 0), t-calculated is obtained as follows:

t-calculated = β̂₂ / se(β̂₂) = 0.1685 / 0.01219 = 13.82

while t-critical = 2.306 at 8 degrees of freedom. Upon comparing the two, we notice that t-calculated > t-critical. Thus, according to our decision rule, we reject the null hypothesis but do not reject (accept) the alternative hypothesis.
In conclusion, we can therefore say that β₂ is not equal to zero, or, β₂ is statistically different from zero.

For the second set of hypotheses (H0: β₂ = 0.16 against H1: β₂ ≠ 0.16), we can obtain t-calculated as follows:

t-calculated = (β̂₂ − β₂*) / se(β̂₂) = (0.1685 − 0.16) / 0.01219 = 0.6973

The value for t-critical remains the same, t-critical = 2.306. Upon comparing t-calculated and t-critical, we notice that t-calculated < t-critical. Thus, following the decision rule, we do not reject (accept) the null hypothesis. In conclusion, we can therefore say that β₂ is statistically equal to 0.16.

POINT TO NOTE: The conclusions from the confidence interval approach resemble the conclusions from the test of significance approach, and this must always be so. Indeed, the confidence interval approach is simply a mirror image of the test of significance approach.
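Both tests of significance can be sketched in Python, with the values taken from the example above:

```python
# Two t tests on the slope coefficient, using the example's values.
beta2_hat, se_beta2, t_crit = 0.1685, 0.01219, 2.306

# H0: beta2 = 0 against H1: beta2 != 0
t_calc_zero = (beta2_hat - 0.0) / se_beta2
reject_zero = abs(t_calc_zero) > t_crit  # True: beta2 differs from zero

# H0: beta2 = 0.16 against H1: beta2 != 0.16
t_calc_016 = (beta2_hat - 0.16) / se_beta2
reject_016 = abs(t_calc_016) > t_crit    # False: beta2 is statistically 0.16

print(round(t_calc_zero, 2), reject_zero)   # 13.82 True
print(round(t_calc_016, 4), reject_016)     # 0.6973 False
```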
3. HYPOTHESIS TESTING USING THE PROBABILITY (P) VALUE APPROACH

The probability (P) value approach is also an ideal way of testing hypotheses. The P-value states the smallest level of significance (α) at which the null hypothesis can be rejected. The beauty of the P-value approach is that most computer software (Excel, SPSS, STATA, Eviews, SHAZAM, RATS, etc.) automatically provides the P-value whenever you run a regression. For example, if the software reports a P-value of 0.07, the null hypothesis can be rejected at any level of significance of 7% or above. Thus, we can reject the null hypothesis at 10%, but we cannot reject it at 5% or 1%. The table below summarizes some P-values and significance levels.

P-value | Details | Significant at 1%? | Significant at 5%? | Significant at 10%?
---|---|---|---|---
P = 0.0000 | β̂₂ is significant at all levels | Yes | Yes | Yes
P = 0.035 | β̂₂ is significant at 3.5% | No | Yes | Yes
P = 0.074 | β̂₂ is significant at 7.4% | No | No | Yes
P = 0.1025 | β̂₂ is significant at 10.25% | No | No | No

In summary: the smaller the P-value, the more significant is β̂₂.
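The decision rule behind the table can be sketched as a one-line check, reject H0 whenever P < α (a minimal illustration; the function name is ours, not a library routine):

```python
# P-value decision rule: the coefficient is significant at level alpha
# whenever the reported P-value is below alpha.
def significant(p_value, alpha):
    """Return True if H0 can be rejected at significance level alpha."""
    return p_value < alpha

# The P-values from the table above, checked at 1%, 5% and 10%:
for p in (0.0000, 0.035, 0.074, 0.1025):
    print(p, [significant(p, a) for a in (0.01, 0.05, 0.10)])
```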
REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE

Analysis of variance (ANOVA) is a study of the Total Sum of Squares (TSS) and its components, i.e. the Explained Sum of Squares (ESS) and the Sum of Squared Residuals (RSS). The concept here is that:

ESS + RSS = TSS
Σŷ² + Σû² = Σy²

By dividing each sum of squares (SS) by its associated degrees of freedom (df), we get the mean sum of squares (MSS). The ANOVA table therefore shows the source of variation, the sum of squares (SS), the degrees of freedom (df), and the mean sum of squares (MSS).
Source of variation | Sum of squares (SS) | df | Mean sum of squares (MSS)
---|---|---|---
Due to regression (ESS) | Σŷ², or β̂₂²Σx² | k − 1 | MSS_reg = ESS/df = β̂₂²Σx²/(k − 1)
Due to residuals (RSS) | Σû² | n − k | MSS_res = RSS/df = Σû²/(n − k)
Total (TSS) | Σy² | n − 1 | 
From the ANOVA table, the F statistic is computed as follows:

F = MSS_reg / MSS_res = [ESS/(k − 1)] / [RSS/(n − k)]

i.e. the mean sum of squares due to regression divided by the mean sum of squares due to residuals. The F statistic follows the F distribution with (k − 1) degrees of freedom in the numerator and (n − k) degrees of freedom in the denominator. It is used to test for the overall significance of the model:
- If F-calculated > F-critical, the model is statistically significant.
- If F-calculated < F-critical, the model is not statistically significant.

In the example, F-calculated > F-critical. Conclusion: the overall model is statistically significant.

MULTIPLE REGRESSION ANALYSIS

A multiple regression model is one with more than one explanatory variable. For example, a multiple regression model is expressed as:

Y = β₀ + β₁X1 + β₂X2 + e

Multiple regression is therefore more realistic than simple regression analysis, because it is hardly ever the case that Y is explained by a single explanatory variable. From the regression model as expressed above, the error term is thus expressed as:
eᵢ = Y − β̂₀ − β̂₁X1 − β̂₂X2

Thus, the sum of squared residuals (RSS) is as follows:

Σeᵢ² = Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)²

Recall that the aim of Ordinary Least Squares is to obtain the OLS estimators (β̂₀, β̂₁, and β̂₂) that will minimize the sum of squared residuals (Σeᵢ²). Thus, to obtain the OLS estimators, we need to obtain the first-order partial derivatives of RSS with respect to each estimator and equate them to zero, i.e. optimization.
∂Σeᵢ²/∂β̂₀ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−1) = 0

Expanding and rearranging:

ΣY − nβ̂₀ − β̂₁ΣX1 − β̂₂ΣX2 = 0

Thus,

nβ̂₀ + β̂₁ΣX1 + β̂₂ΣX2 = ΣY ........................................(1)

∂Σeᵢ²/∂β̂₁ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−X1) = 0

Expanding and rearranging:

ΣYX1 − β̂₀ΣX1 − β̂₁ΣX1² − β̂₂ΣX1X2 = 0

Thus,

β̂₀ΣX1 + β̂₁ΣX1² + β̂₂ΣX1X2 = ΣYX1 ........................................(2)

∂Σeᵢ²/∂β̂₂ = 2Σ(Y − β̂₀ − β̂₁X1 − β̂₂X2)(−X2) = 0

Expanding and rearranging:

ΣYX2 − β̂₀ΣX2 − β̂₁ΣX1X2 − β̂₂ΣX2² = 0

Thus,

β̂₀ΣX2 + β̂₁ΣX1X2 + β̂₂ΣX2² = ΣYX2 ........................................(3)

The result of optimization is three first-order NORMAL EQUATIONS, which are now reproduced as follows:
nβ̂₀ + β̂₁ΣX1 + β̂₂ΣX2 = ΣY
β̂₀ΣX1 + β̂₁ΣX1² + β̂₂ΣX1X2 = ΣYX1
β̂₀ΣX2 + β̂₁ΣX1X2 + β̂₂ΣX2² = ΣYX2

We can now represent these equations in matrix form as follows:

| n    ΣX1    ΣX2   |   | β̂₀ |   | ΣY   |
| ΣX1  ΣX1²   ΣX1X2 | × | β̂₁ | = | ΣYX1 |
| ΣX2  ΣX1X2  ΣX2²  |   | β̂₂ |   | ΣYX2 |

This is of the familiar form A·β̂ = D. First, obtain the determinant of A.
In order to obtain the values of the OLS estimators β̂₀, β̂₁, and β̂₂, we can employ Cramer's Rule as follows:

o To get β̂₀, replace the first column of matrix A with the column vector D and get the determinant of the new matrix, A1. Then divide determinant A1 by the original determinant A.
o To get β̂₁, replace the second column of matrix A with the column vector D and get the determinant of the new matrix, A2. Then divide determinant A2 by the original determinant A.
o To get β̂₂, replace the third column of matrix A with the column vector D and get the determinant of the new matrix, A3. Then divide determinant A3 by the original determinant A.

Example: QUE: Given the following data, regress Y on X1 and X2.

Y | 6 | 10 | 9 | 14 | 7 | 5
X1 | 1 | 3 | 2 | -2 | 3 | 5
X2 | 3 | -1 | 4 | 6 | 2 | 4
SOLUTION

Y | X1 | X2 | Y² | X1² | X2² | YX1 | YX2 | X1X2
---|---|---|---|---|---|---|---|---
6 | 1 | 3 | 36 | 1 | 9 | 6 | 18 | 3
10 | 3 | -1 | 100 | 9 | 1 | 30 | -10 | -3
9 | 2 | 4 | 81 | 4 | 16 | 18 | 36 | 8
14 | -2 | 6 | 196 | 4 | 36 | -28 | 84 | -12
7 | 3 | 2 | 49 | 9 | 4 | 21 | 14 | 6
5 | 5 | 4 | 25 | 25 | 16 | 25 | 20 | 20
Σ = 51 | 12 | 18 | 487 | 52 | 82 | 72 | 162 | 22

n = 6. Thus, we can write as follows:
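As a check, the column totals above can be recomputed from the raw data (a minimal Python sketch; the dictionary keys are illustrative):

```python
# Recompute the sums needed for the normal equations from the raw data.
Y = [6, 10, 9, 14, 7, 5]
X1 = [1, 3, 2, -2, 3, 5]
X2 = [3, -1, 4, 6, 2, 4]

sums = {
    "Y": sum(Y), "X1": sum(X1), "X2": sum(X2),
    "Y2": sum(y * y for y in Y),
    "X1_2": sum(x * x for x in X1),
    "X2_2": sum(x * x for x in X2),
    "YX1": sum(y * x for y, x in zip(Y, X1)),
    "YX2": sum(y * x for y, x in zip(Y, X2)),
    "X1X2": sum(a * b for a, b in zip(X1, X2)),
}
print(sums)  # Y=51, X1=12, X2=18, Y2=487, X1_2=52, X2_2=82, YX1=72, YX2=162, X1X2=22
```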
| 6   12  18 |   | β̂₀ |   | 51  |
| 12  52  22 | × | β̂₁ | = | 72  |
| 18  22  82 |   | β̂₂ |   | 162 |

According to Cramer's rule, the determinant of A is:

|A| = 6(52×82 − 22×22) − 12(12×82 − 22×18) + 18(12×22 − 52×18) = 3,528

Replacing the first column of A with D and taking the determinant gives |A1| = 41,580, so:

β̂₀ = |A1|/|A| = 41,580/3,528 = 11.7857

Replacing the second column gives |A2| = −4,284, so:

β̂₁ = |A2|/|A| = −4,284/3,528 = −1.2143

Replacing the third column gives |A3| = −1,008, so:

β̂₂ = |A3|/|A| = −1,008/3,528 = −0.2857

Hence, the regression of Y on X1 and X2 is as follows:

Y = 11.7857 − 1.2143X1 − 0.2857X2
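Cramer's rule as applied above can be sketched in Python; the `det3` helper is our own illustrative function, while the matrix and vector come from the normal equations above:

```python
# Solving the 3x3 normal-equation system by Cramer's rule.
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[6, 12, 18], [12, 52, 22], [18, 22, 82]]
D = [51, 72, 162]

detA = det3(A)  # 3528
betas = []
for col in range(3):
    Ai = [row[:] for row in A]
    for r in range(3):
        Ai[r][col] = D[r]          # replace column `col` with D
    betas.append(det3(Ai) / detA)  # Cramer's rule

print([round(b, 4) for b in betas])  # [11.7857, -1.2143, -0.2857]
```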
Interpretation:

- β̂₀ = 11.7857: the average or expected value of Y is 11.7857 when X1 and X2 are both zero.
- β̂₁ = −1.2143: holding X2 constant, a unit increase in X1 will lead to a 1.2143-unit decrease in Y.
- β̂₂ = −0.2857: holding X1 constant, a unit increase in X2 will lead to a 0.2857-unit decrease in Y.

Finding the OLS estimators using the Deviation Method
We can also obtain the OLS estimators β̂₀, β̂₁, and β̂₂ using the deviation method. In deviation form, the normal equations reduce to a 2×2 system:

β̂₁Σx1² + β̂₂Σx1x2 = Σyx1
β̂₁Σx1x2 + β̂₂Σx2² = Σyx2

Solving these simultaneously gives:

β̂₁ = (Σyx1 · Σx2² − Σyx2 · Σx1x2) / (Σx1² · Σx2² − (Σx1x2)²)

β̂₂ = (Σyx2 · Σx1² − Σyx1 · Σx1x2) / (Σx1² · Σx2² − (Σx1x2)²)

And β̂₀ = Ȳ − β̂₁X̄1 − β̂₂X̄2

Where the lower-case letters denote deviations from the sample means: y = Y − Ȳ, x1 = X1 − X̄1, and x2 = X2 − X̄2.
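The deviation method can be sketched in Python on the example data (variable names are illustrative); it reproduces the Cramer's-rule estimates:

```python
# Deviation-method OLS for the two-regressor example data.
Y = [6, 10, 9, 14, 7, 5]
X1 = [1, 3, 2, -2, 3, 5]
X2 = [3, -1, 4, 6, 2, 4]
n = len(Y)

Ybar, X1bar, X2bar = sum(Y) / n, sum(X1) / n, sum(X2) / n
y = [v - Ybar for v in Y]     # deviations from the mean
x1 = [v - X1bar for v in X1]
x2 = [v - X2bar for v in X2]

S = lambda a, b: sum(i * j for i, j in zip(a, b))  # sum of cross-products
denom = S(x1, x1) * S(x2, x2) - S(x1, x2) ** 2

b1 = (S(y, x1) * S(x2, x2) - S(y, x2) * S(x1, x2)) / denom
b2 = (S(y, x2) * S(x1, x1) - S(y, x1) * S(x1, x2)) / denom
b0 = Ybar - b1 * X1bar - b2 * X2bar

print(round(b0, 4), round(b1, 4), round(b2, 4))  # 11.7857 -1.2143 -0.2857
```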