Econometrics lecture Chapter 2 note pdf-1 PDF

Title Econometrics lecture Chapter 2 note pdf-1
Author Gisha Bekele
Course MATHEMATICAL ECONOMICS
Institution Jimma University
Pages 34
File Size 819.2 KB
File Type PDF



Description

Chapter 2: The Simple Linear Regression Model (the Classical Linear Regression Model)

Introduction

In economics, the relationship between variables is mainly explained in terms of dependent and independent variables. The dependent variable is the variable whose average value is computed from the already known values of the explanatory variable(s). The values of the explanatory variables are taken as fixed in repeated sampling from the population.

Example: Suppose the quantity of a commodity demanded by an individual depends on the price of the commodity, the income of the individual, the prices of other goods, and so on. Then quantity demanded is the dependent variable, whose value is determined by the price of the commodity, the income of the individual, the prices of other goods, etc. The price of the commodity, income and the prices of other goods are the independent (explanatory) variables, whose values are obtained from the population by repeated sampling. The relationship between the dependent and independent variables is the concern of regression analysis, i.e.

$$Q_d = f(P, P_0, Y, \ldots) \qquad (2.1)$$

If we study the relationship between the dependent variable and one independent variable, i.e. $Q_d = f(P)$, we have the simple (two-variable) regression model, because there is one dependent variable $Q_d$ and one independent variable $P$. If the dependent variable depends on more than one independent variable, such as $Q_d = f(P, P_0, Y)$, we have multiple regression analysis. The functional relationship between the dependent and independent variables may be linear or non-linear.

2.1 The Simple Linear Regression Analysis

The relationship between the dependent and independent variables suggested by economic theory is usually specified as an exact, or deterministic, relationship. In reality, the relationships between economic variables are inexact, that is, stochastic (non-deterministic) in nature. Example:
Suppose consumption expenditure on a commodity depends on the current income of the individual, ceteris paribus, and assume that the functional relationship is linear. Then we can write

$$C_t = \alpha + \beta Y_t \qquad (2.2)$$

For each specific value of current income ($Y_t$) there is only one corresponding value of consumption expenditure, so equation (2.2) makes consumption expenditure depend on current income alone. But consumption expenditure is not determined by income alone: other variables such as wealth, previous income, tradition, etc. also affect it. The relationship between the two variables is therefore inexact, and to capture the other factors that affect consumption expenditure in equation (2.2) we incorporate a variable $U$:

$$C_t = \alpha + \beta Y_t + U_t \qquad (2.3)$$

where $C_t$ is the dependent variable, $Y_t$ is the independent variable, $\alpha$ and $\beta$ are the regression parameters, and $U_t$ is the stochastic disturbance term (error term).

We introduce the random term $U$ for the following reasons.

i. Omission of variables from the function. In economic reality each variable is influenced by a very large number of factors, and not every factor can be included in the function because:
a) Some of the factors may not be known.
b) Even if we know them, some factors cannot be measured statistically; for example, psychological factors (tastes, preferences, expectations, etc.) are not measurable.
c) Some factors are random, appearing in an unpredictable way and at unpredictable times, for example epidemics, earthquakes, etc.

Jimma University; Department of Economics

By Alemu A. and Getache K. (PhD candidates)

d) Some factors may be omitted because of their small influence on the dependent variable.
e) Even if all factors are known, the available data may not be adequate to measure all the factors influencing a relationship.

ii. The erratic nature of human beings: human behavior may deviate from the normal situation to a certain extent in an unpredictable way.

iii. Misspecification of the mathematical model: we may have wrongly specified the relationship between variables. We may fit a linear function to a non-linear relationship, or use a single-equation model for relationships that are determined simultaneously.

iv. Error of aggregation: aggregation of data introduces error into the relationship. Many economic data, e.g. consumption and income, are available only in aggregate form, obtained by adding magnitudes that refer to individuals whose behavior is dissimilar.

v. Errors of measurement: when we collect data we may commit errors of measurement.

To take the above sources of error into account, we introduce into econometric functions a random variable, usually denoted by the letter $U$ and called the error term, random disturbance term or stochastic term of the function. With this random term the model takes the form of equation (2.3), and the relationship between the variables is split into two parts. In equation (2.3), $\alpha + \beta Y_t$ represents the exact relationship explained by the line, while the part represented by the random term $U_t$ is the part left unexplained by the line. This can be explained using the following graph.

[Figure 1: consumption $C_t$ (vertical axis) plotted against current income $Y_t$ (horizontal axis), showing the line $C_t = \alpha + \beta Y_t$ and the observations scattered around it, with deviations such as $U_2$ and $U_n$ measured from the line.]

Figure 1. The line $C_t = \alpha + \beta Y_t$ shows the exact relationship between consumption and income. Because other variables also affect consumption expenditure, the observations are scattered around the straight line. The true relationship is therefore described by the scatter of observations of $C_t$ against $Y_t$.


$$\underbrace{C_t}_{\text{variation in consumption}} = \underbrace{\alpha + \beta Y_t}_{\text{explained variation}} + \underbrace{U_t}_{\text{unexplained variation}} \qquad (2.4)$$
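The split in equation (2.4) can be illustrated with a small simulation. The sketch below is only an illustration: the parameter values, the income figures and the normal distribution used for $U_t$ are assumptions made for the example, not values taken from these notes.

```python
import random

# Illustrative values only: alpha, beta, sigma_u and the income figures
# are assumptions made for this sketch, not numbers from the notes.
alpha, beta, sigma_u = 50.0, 0.8, 5.0
incomes = [100.0, 150.0, 200.0, 250.0, 300.0]

rng = random.Random(42)  # fixed seed so the run is reproducible

rows = []
for y_t in incomes:
    explained = alpha + beta * y_t   # exact part of equation (2.3)
    u_t = rng.gauss(0.0, sigma_u)    # random disturbance U_t
    c_t = explained + u_t            # observed consumption C_t
    rows.append((y_t, c_t, explained, u_t))

for y_t, c_t, explained, u_t in rows:
    print(f"Y_t={y_t:6.1f}  C_t={c_t:7.2f}  explained={explained:6.1f}  U_t={u_t:6.2f}")
```

Each printed row shows one observation as the sum of the part on the line and the part off the line; only $C_t$ and $Y_t$ would be visible to a researcher working with real data.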

To estimate this equation we would need data on $C_t$, $Y_t$ and $U_t$. Since $U_t$, unlike $C_t$ and $Y_t$, is never observed, we must make some assumptions about the distribution of each $U_i$ (its mean, variance, covariance, etc.).

2.2 Assumptions of the linear stochastic regression model

The assumptions about $U_i$ fall into three groups:
a) Some refer to the distribution of the random variable $U_i$.
b) Some refer to the relationship between $U_i$ and the explanatory variables.
c) Some refer to the relationship between the explanatory variables themselves.

2.2.1 Assumptions about $U_i$

i) $U_i$ is a random real variable: the value which $U_i$ may assume in any one period depends on chance; it may be positive, negative or zero.
ii) The mean value of $U_i$ in any particular period is zero, i.e. $E(U_i) = 0$.
iii) Homoscedasticity (constant variance): the variance of $U_i$ is the same for all values of the explanatory variable, i.e. the dispersion of $U_i$ around the straight line (in Figure 1) remains the same: $\mathrm{Var}(U_i) = \sigma_u^2$.
iv) $U_i$ has a normal distribution with mean zero and variance $\sigma_u^2$: $U_i \sim N(0, \sigma_u^2)$.
v) $U_i$ is serially independent: the value of $U$ in one period does not depend on its value in any other period, so the covariance between $U_i$ and $U_j$ ($i \neq j$) is zero:

$$\mathrm{Cov}(U_i, U_j) = E\{[U_i - E(U_i)][U_j - E(U_j)]\}$$

By assumption (ii), $E(U_i) = E(U_j) = 0$, so

$$\mathrm{Cov}(U_i, U_j) = E[(U_i - 0)(U_j - 0)] = E(U_i U_j) = E(U_i)E(U_j) = 0$$

where the third equality uses the independence of $U_i$ and $U_j$ and the last uses assumption (ii) again.

2.2.2 Assumptions about $U_i$ and $X_i$

i) The disturbance term $U_i$ is not correlated with the explanatory variables. This means that the $U_i$ and the $X_i$ do not move together, i.e. the covariance between $U_i$ and $X_i$ is zero:

$$\mathrm{Cov}(U_i, X_i) = E\{[U_i - E(U_i)][X_i - E(X_i)]\}$$

By assumption $E(U_i) = 0$, so

$$\mathrm{Cov}(U_i, X_i) = E\{[U_i - 0][X_i - E(X_i)]\} = E[U_i X_i - U_i E(X_i)] = E(U_i X_i) - E(U_i)E(X_i) = E(U_i X_i)$$

Since the values of the $X_i$ are fixed, $E(U_i X_i) = X_i E(U_i) = 0$, and therefore $\mathrm{Cov}(U_i, X_i) = 0$.

ii) The explanatory variables $X_i$ are measured without error, i.e. there is no problem of aggregation, rounding off, etc. If there is such a problem in the measurement, it is absorbed by the random term $U_i$.

2.2.3 Relationship between the explanatory variables

If there is more than one explanatory variable, the explanatory variables are assumed not to be perfectly correlated with each other. Example: in

$$Y_i = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + U_i$$

the pairs $X_1$ and $X_2$, $X_2$ and $X_3$, and $X_1$ and $X_3$ are not perfectly correlated with each other, i.e. there is no (perfect) multicollinearity.

The distribution of the dependent variable Y

Given the relationship $Y_i = \alpha + \beta X_i + U_i$, the mean (expected value) of $Y_i$ can be found as follows:

$$E(Y_i) = E[\alpha + \beta X_i + U_i] = \alpha + \beta X_i + E(U_i)$$

where $E(U_i) = 0$ by assumption, so

$$E(Y_i) = \alpha + \beta X_i$$

is the mean value of the dependent variable $Y_i$. Variance of $Y_i$:

$$\mathrm{Var}(Y_i) = E[Y_i - E(Y_i)]^2$$

Substituting $E(Y_i) = \alpha + \beta X_i$ and then $Y_i = \alpha + \beta X_i + U_i$:

$$\mathrm{Var}(Y_i) = E[\alpha + \beta X_i + U_i - \alpha - \beta X_i]^2 = E(U_i^2)$$

From our previous assumption the variance of $U_i$ is $E(U_i^2) = \sigma_u^2$, so

$$\mathrm{Var}(Y_i) = E(U_i^2) = \sigma_u^2$$

which is constant. $Y_i$ is therefore normally distributed with this mean and variance:

$$Y_i \sim N(\alpha + \beta X_i, \sigma_u^2)$$
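The mean and variance of $Y_i$ derived above can be checked by repeated sampling at a fixed value of $X$. This is a minimal sketch: the values of `alpha`, `beta`, `sigma_u` and `x_fixed` are assumptions chosen for illustration, and the sample moments will only be close to the theoretical ones, not equal to them.

```python
import random
import statistics

# Assumed illustrative values (not from the notes).
alpha, beta, sigma_u = 2.0, 0.5, 2.0
x_fixed = 10.0            # X is held fixed in repeated sampling
rng = random.Random(0)    # fixed seed for reproducibility

# Draw many disturbances U_i ~ N(0, sigma_u^2) and build Y_i for the same X.
draws = [alpha + beta * x_fixed + rng.gauss(0.0, sigma_u) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)

print("theoretical mean:    ", alpha + beta * x_fixed, " sample mean:    ", round(sample_mean, 3))
print("theoretical variance:", sigma_u ** 2, " sample variance:", round(sample_var, 3))
```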


2.3 Estimation of the model

The relationship

$$Y_i = \alpha + \beta X_i + U_i \qquad (2.5)$$

holds for the population of values of X and Y. Since the population values are unknown, we do not know the exact numerical values of $\alpha$ and $\beta$. To obtain numerical values we take sample observations on Y and X. Substituting these values into the population regression gives the sample regression, which yields estimates of $\alpha$ and $\beta$, denoted $\hat{\alpha}$ and $\hat{\beta}$ respectively. The sample regression line is then

$$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i \qquad (2.6)$$

The true relationship between the variables (the one that describes the population) is

$$Y_i = \alpha + \beta X_i + U_i \qquad (2.7)$$

If we estimate this relationship using sample observations, we get the estimated relationship

$$Y_i = \hat{\alpha} + \hat{\beta} X_i + e_i \qquad (2.8)$$

where the residual $e_i$ is the sample counterpart of $U_i$.

We can estimate $\alpha$ and $\beta$ by the method of ordinary least squares (OLS), also called classical least squares (CLS). There are many reasons for starting with OLS:
i. The parameter estimates obtained by this method have some optimal properties, i.e. they are BLUE (Best Linear Unbiased Estimators).
ii. The computational procedure of OLS is fairly simple compared with other econometric methods.
iii. OLS is one of the most commonly employed methods of estimating econometric models.
iv. The mechanics of OLS are simple to understand.
v. OLS is an essential component of most other econometric techniques.

From the sample observations we have

$$Y_i = \hat{\alpha} + \hat{\beta} X_i + e_i \qquad (2.9)$$

$$e_i = Y_i - \hat{\alpha} - \hat{\beta} X_i \qquad (2.10)$$

The task is to find the values of the estimates $\hat{\alpha}$ and $\hat{\beta}$ that minimize the sum of squared residuals:

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta} X_i)^2 \qquad (2.11)$$

To find the values of $\hat{\alpha}$ and $\hat{\beta}$ that minimize this sum, we differentiate with respect to $\hat{\alpha}$ and $\hat{\beta}$ and set the partial derivatives equal to zero:

$$\frac{\partial \sum e_i^2}{\partial \hat{\alpha}} = -2 \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0 \qquad (2.12)$$

$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}} = -2 \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i) X_i = 0 \qquad (2.13)$$

First take equation (2.12) to find the value of $\hat{\alpha}$:

$$-2 \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0$$

Carrying the summation through each term (note that $\sum \hat{\alpha} = n\hat{\alpha}$):

$$-2 \sum Y_i + 2n\hat{\alpha} + 2\hat{\beta} \sum X_i = 0$$

$$2 \sum Y_i - 2\hat{\beta} \sum X_i = 2n\hat{\alpha}$$

Dividing by $2n$ gives

$$\hat{\alpha} = \frac{\sum Y_i}{n} - \hat{\beta} \frac{\sum X_i}{n}$$

and since $\sum Y_i / n = \bar{Y}$ and $\sum X_i / n = \bar{X}$,

$$\hat{\alpha} = \bar{Y} - \hat{\beta} \bar{X} \qquad (2.14)$$

Next take equation (2.13) to find the value of $\hat{\beta}$:

$$-2 \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i) X_i = 0$$

Multiplying through by $X_i$ and summing over each term:

$$-2 \left( \sum Y_i X_i - \hat{\alpha} \sum X_i - \hat{\beta} \sum X_i^2 \right) = 0$$

$$\sum Y_i X_i = \hat{\alpha} \sum X_i + \hat{\beta} \sum X_i^2 \qquad (2.15)$$

Substituting equation (2.14) into equation (2.15):

$$\sum Y_i X_i = (\bar{Y} - \hat{\beta} \bar{X}) \sum X_i + \hat{\beta} \sum X_i^2 = \bar{Y} \sum X_i - \hat{\beta} \bar{X} \sum X_i + \hat{\beta} \sum X_i^2$$

We know that $\bar{Y} = \sum Y_i / n$ and $\bar{X} = \sum X_i / n$. Substituting:

$$\sum Y_i X_i = \frac{\sum Y_i \sum X_i}{n} - \hat{\beta} \frac{(\sum X_i)^2}{n} + \hat{\beta} \sum X_i^2$$

Multiplying both sides by $n$:

$$n \sum Y_i X_i = \sum Y_i \sum X_i - \hat{\beta} \left( \sum X_i \right)^2 + n \hat{\beta} \sum X_i^2$$

$$n \sum Y_i X_i - \sum Y_i \sum X_i = \hat{\beta} \left[ n \sum X_i^2 - \left( \sum X_i \right)^2 \right]$$

$$\hat{\beta} = \frac{n \sum Y_i X_i - \sum Y_i \sum X_i}{n \sum X_i^2 - \left( \sum X_i \right)^2} \qquad (2.16)$$
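Equations (2.14) and (2.16) can be applied directly to a small data set. The sketch below uses a hypothetical five-observation sample (the numbers are made up for illustration) and then checks that the two first-order conditions (2.12) and (2.13) hold at the computed estimates.

```python
# Hypothetical sample, chosen only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(X)

sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# Equation (2.16): slope estimate from raw sums.
beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Equation (2.14): intercept estimate from the sample means.
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Residuals, equation (2.10).
residuals = [y - alpha_hat - beta_hat * x for x, y in zip(X, Y)]

# First-order conditions (2.12) and (2.13): both sums should be
# zero up to floating-point rounding.
foc_alpha = sum(residuals)
foc_beta = sum(e * x for e, x in zip(residuals, X))

print("beta_hat =", beta_hat, " alpha_hat =", alpha_hat)
print("sum of e_i =", foc_alpha, " sum of e_i * X_i =", foc_beta)
```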

The numerical values of $\hat{\alpha}$ and $\hat{\beta}$ can also be found in deviation form. To write equation (2.16) in deviation form, take the numerator, which is

$$n \sum X_i Y_i - \sum Y_i \sum X_i$$

Adding and subtracting $\sum X_i \sum Y_i$:

$$n \sum X_i Y_i - \sum Y_i \sum X_i - \sum X_i \sum Y_i + \sum X_i \sum Y_i$$

Using $\sum X_i = n\bar{X}$ and $\sum Y_i = n\bar{Y}$:

$$= n \sum X_i Y_i - n\bar{Y} \sum X_i - n\bar{X} \sum Y_i + n^2 \bar{X}\bar{Y}$$

Taking $n$ as a common factor:

$$= n \left[ \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n \bar{X}\bar{Y} \right]$$

$$= n \sum (X_i - \bar{X})(Y_i - \bar{Y}) \qquad (2.17)$$

This expression is equal to the numerator of equation (2.16). Next take the denominator of equation (2.16):

$$n \sum X_i^2 - \left( \sum X_i \right)^2 = n \sum X_i^2 - 2 \left( \sum X_i \right)^2 + \left( \sum X_i \right)^2$$

$$= n \sum X_i^2 - 2 n \bar{X} \sum X_i + n^2 \bar{X}^2$$

$$= n \left[ \sum X_i^2 - 2 \bar{X} \sum X_i + n \bar{X}^2 \right]$$

$$= n \sum (X_i - \bar{X})^2 \qquad (2.18)$$

Taking equation (2.17) as the numerator and equation (2.18) as the denominator:

$$\hat{\beta} = \frac{n \sum (X_i - \bar{X})(Y_i - \bar{Y})}{n \sum (X_i - \bar{X})^2}$$

Let $X_i - \bar{X} = x_i$ and $Y_i - \bar{Y} = y_i$. Substituting in the above equation:

$$\hat{\beta} = \frac{n \sum x_i y_i}{n \sum x_i^2}$$

$$\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2} \qquad (2.19)$$
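The equivalence of the raw-sum formula (2.16) and the deviation formula (2.19) can be confirmed numerically. In the sketch below the function names `ols_deviation_form` and `beta_raw_sums` and the data are hypothetical, introduced only for this illustration.

```python
def ols_deviation_form(xs, ys):
    """Return (alpha_hat, beta_hat) using equations (2.14) and (2.19)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    x_dev = [x - x_bar for x in xs]  # x_i = X_i - X_bar
    y_dev = [y - y_bar for y in ys]  # y_i = Y_i - Y_bar
    beta_hat = sum(xd * yd for xd, yd in zip(x_dev, y_dev)) / sum(xd * xd for xd in x_dev)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def beta_raw_sums(xs, ys):
    """Return the slope estimate from raw sums, equation (2.16)."""
    n = len(xs)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    denominator = n * sum(x * x for x in xs) - sum(xs) ** 2
    return numerator / denominator


# Hypothetical sample, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.2, 4.1, 6.2, 7.9, 10.1]

alpha_hat, beta_hat = ols_deviation_form(X, Y)
print("deviation form:", alpha_hat, beta_hat)
print("raw-sum slope: ", beta_raw_sums(X, Y))
```

The deviation form is usually the more convenient one for hand calculation, because the sums involve smaller numbers once the means have been subtracted.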

2.4 Statistical tests of the estimates

The two most commonly used tests in econometrics are $r^2$, the square of the correlation coefficient, and the standard error (s.e.) test of the estimates.

The square of the correlation coefficient: $r^2$ ($R^2$)

When we estimate a two-variable model (one independent variable X and one dependent variable Y) we obtain $r^2$. If the model has more than two variables (one dependent variable and more than one independent variable $X_1, X_2, \ldots, X_n$) we obtain the coefficient of determination $R^2$.

Definition of $r^2$ ($R^2$): After estimating $\hat{\alpha}$ and $\hat{\beta}$ from the sample observations of Y and X by OLS, we need to know how 'good' the fit of the line to the sample observations is; that is, we need to measure the dispersion of the sample observations around the regression line. The closer the observations are to the line, the better the variation of Y is explained by changes in the explanatory variable(s). $r^2$ shows the percentage of the total variation of the dependent variable that can be explained by the independent variable X.

[Figure: Y (vertical axis) plotted against X (horizontal axis), showing the estimated regression line $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$ and, for one observation, the total variation, the explained variation and the unexplained variation.]

Suppose a researcher has the model $Y_i = \alpha + \beta X_i + U_i$. To estimate it, he takes sample observations and estimates the values of $\alpha$ and $\beta$. The sample points may fall below, above or on the estimated line. Using $R^2$ he can judge how well the regression line fits these data.

• $Y_i$ is the observed sample value.
• $\bar{Y}$ is the mean value of the sample.
• $\hat{Y}_i$ is the value on the regression line estimated from the sample data.
• $Y_i - \bar{Y}$ shows by how much the actual sample value deviates from the sample mean. This is called the total variation and is represented by lowercase $y_i$.
• $\hat{Y}_i - \bar{Y}$ shows by how much the estimated value deviates from the sample mean. This is called the explained variation and is represented by $\hat{y}_i$.
• $Y_i - \hat{Y}_i$ shows the difference between the actual value $Y_i$ and the estimated value $\hat{Y}_i$. This is called the unexplained variation and is represented by $e_i$.

Therefore:

Total variation: $$y_i = Y_i - \bar{Y} \qquad (2.20)$$

Explained variation: $$\hat{y}_i = \hat{Y}_i - \bar{Y} \qquad (2.21)$$

Unexplained variation: $$e_i = Y_i - \hat{Y}_i \qquad (2.22)$$

Squaring each of these and summing over all observations, we have

$$\sum y_i^2 = \sum (Y_i - \bar{Y})^2 \qquad (2.23)$$

We square the deviations because the sum of the deviations of any variable around its mean is zero; squaring avoids this.

$$\sum \hat{y}_i^2 = \sum (\hat{Y}_i - \bar{Y})^2 \qquad (2.24)$$

$$\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 \qquad (2.25)$$
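The three sums of squares in equations (2.23) to (2.25) are straightforward to compute once the line has been estimated. The sketch below uses hypothetical data, and it computes $r^2$ as the ratio of explained to total variation, which is the standard definition and is assumed here rather than quoted from these notes.

```python
# Hypothetical sample, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# OLS estimates in deviation form, equations (2.19) and (2.14).
beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
alpha_hat = y_bar - beta_hat * x_bar

# Fitted values on the sample regression line, equation (2.6).
Y_hat = [alpha_hat + beta_hat * x for x in X]

total = sum((y - y_bar) ** 2 for y in Y)                    # equation (2.23)
explained = sum((yh - y_bar) ** 2 for yh in Y_hat)          # equation (2.24)
unexplained = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))  # equation (2.25)

# Assumed standard definition: share of total variation that is explained.
r_squared = explained / total

print("total =", total, " explained =", explained, " unexplained =", unexplained)
print("r^2 =", r_squared)
```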

We can write equation (2.20) as follows:

$$y_i = Y_i - \bar{Y} \;\Rightarrow\; Y_i = y_i + \bar{Y} \qquad (2.26)$$

From equation (2.21):

$$\hat{y}_i = \hat{Y}_i - \bar{Y} \;\Rightarrow\; \hat{Y}_i = \hat{y}_i + \bar{Y} \qquad (2.27)$$

Substituting equations (2.26) and (2.27) into equation (2.22), $e_i = Y_i - \hat{Y}_i$:

$$e_i = (y_i + \bar{Y}) - (\hat{y}_i + \bar{Y}) = y_i - \hat{y}_i$$

$$y_i = \hat{y}_i + e_i \qquad (2.28)$$

This shows that each deviation of the observed sample values of Y from the mean, $Y_i - \bar{Y} = y_i$, consists of two components:
i. $\hat{y}_i = \hat{Y}_i - \bar{Y}$, the amount explained by the regression line;
ii. $e_i = Y_i - \hat{Y}_i$, the variation left unexplained by the regression line.

Taking equation (2.28), $y_i = \hat{y}_i + e_i$, and summing over all observations:

$$\sum y_i = \sum (\hat{y}_i + e_i)$$

Squaring each term before summing:

$$\sum y_i^2 = \sum (\hat{y}_i + e_i)^2 = \sum \hat{y}_i^2 + 2 \sum \hat{y}_i e_i + \sum e_i^2 \qquad (2.29)$$

In this equation the cross term $\sum \hat{y}_i e_i$ is equal to zero. We can prove it.

We know that $\hat{y}_i = \hat{Y}_i - \bar{Y}$, that $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$, and from equation (2.14) that $\bar{Y} = \hat{\alpha} + \hat{\beta} \bar{X}$. Substituting these into $\hat{y}_i = \hat{Y}_i - \bar{Y}$:

$$\hat{y}_i = \hat{\alpha} + \hat{\beta} X_i - \hat{\alpha} - \hat{\beta} \bar{X} = \hat{\beta} (X_i - \bar{X})$$

Since $X_i - \bar{X} = x_i$ is $X_i$ in deviation form,

$$\hat{y}_i = \hat{\beta} x_i \qquad (2.30)$$

We also know that $e_i = y_i - \hat{y}_i$. Since $\hat{y}_i = \hat{\beta} x_i$, substituting gives

$$e_i = y_i - \hat{\beta} x_i \qquad (2.31)$$

The cross term from equation (2.29) is then

$$\sum \hat{y}_i e_i = \sum \hat{\beta} x_i (y_i - \hat{\beta} x_i) = \hat{\beta} \sum x_i (y_i - \hat{\beta} x_i) = \hat{\beta} \sum x_i y_i - \hat{\beta}^2 \sum x_i^2$$

then substitute in ...
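The claim that the cross term in equation (2.29) vanishes can be checked numerically. This is only a numerical illustration on hypothetical data, not a substitute for the algebraic proof; it builds $\hat{y}_i$ and $e_i$ from equations (2.30) and (2.31) and evaluates both sides of (2.29).

```python
# Hypothetical sample, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
x_dev = [x - x_bar for x in X]  # x_i
y_dev = [y - y_bar for y in Y]  # y_i

# Slope estimate in deviation form, equation (2.19).
beta_hat = sum(xd * yd for xd, yd in zip(x_dev, y_dev)) / sum(xd * xd for xd in x_dev)

y_hat_dev = [beta_hat * xd for xd in x_dev]                      # equation (2.30)
resid = [yd - beta_hat * xd for xd, yd in zip(x_dev, y_dev)]     # equation (2.31)

# Cross term of equation (2.29); zero up to floating-point rounding.
cross_term = sum(yh * e for yh, e in zip(y_hat_dev, resid))

# Both sides of equation (2.29) once the cross term is dropped.
lhs = sum(yd ** 2 for yd in y_dev)
rhs = sum(yh ** 2 for yh in y_hat_dev) + sum(e ** 2 for e in resid)

print("cross term =", cross_term)
print("sum y_i^2 =", lhs, " sum yhat_i^2 + sum e_i^2 =", rhs)
```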


Similar Free PDFs