lab 1 - Toni Duras, 2019 Autumn PDF

Title lab 1 - Toni Duras, 2019 Autumn
Author Yushuo Li
Course Business Statistics 2
Institution Jönköping University
Pages 9
File Size 478.1 KB
File Type PDF



Description

1. The file Stock.sav contains the mean and standard deviation of the excess returns of the S&P 50 stocks. Examine the relation between the two variables graphically using a scatter plot. Calculate the correlation between the two variables.

a. Comment on the relationship. Is the relationship strong?

A scatter plot is a graphical representation of individual scores on two variables. In the graph, standard deviation is shown on the horizontal axis and mean on the vertical axis. The two variables have a positive association: as the standard deviation increases, so does the mean. The points are tightly clustered about the trend line, so the relationship is strong.
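The correlation asked for in the question can be computed along the following lines. This is a minimal sketch: the numbers are made up for illustration and are not the contents of Stock.sav, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, NOT taken from Stock.sav.
std_dev = [2.0, 3.0, 4.0, 5.0, 6.0]
mean_ret = [0.5, 0.9, 1.0, 1.6, 1.9]

r = pearson_r(std_dev, mean_ret)
print(f"r = {r:.3f}")
```

A value of r close to +1 corresponds to points lying tightly around an upward-sloping line, which is the pattern described in the answer.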

2. A factory keeps records of the number of shifts worked each month and the output. On an intuitive basis it is believed that number of shifts worked is an important determinant of output. Data is available in the workingoutput.sav file. However, one may wonder how strong such a relation may be, or if it is even significant.

Consider the following model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

a. What is the interpretation of the intercept in this specific model? Is it reasonable to assume that the intercept is zero? Try to answer the question without using the data (use common sense). Then test the hypothesis $H_0: \beta_0 = 0$ vs. $H_A: \beta_0 \neq 0$ at an arbitrary test level (that is, you may decide α=0.1, α=0.05 or α=0.01 yourself). State your hypotheses and alpha. Do the test using both the critical value method and the p-value approach. If you reject/accept the null hypothesis, explain what this implies in terms of the data.

x is the number of shifts worked (independent variable); y is output (dependent variable).

Coefficients (a)

Model            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
(Constant)       -48.393              34.423                             -1.406   0.172
shifts_worked      8.440               0.716       0.921                 11.791   0.000

a. Dependent Variable: Output

Estimated model: $\hat{Y} = -48.393 + 8.440x$

The intercept is the expected value of Y when X = 0, in our case the output when no shifts are worked. It is reasonable to assume that it is 0, since there should not be any output when no shifts are worked.

$H_0: \beta_0 = 0$ vs. $H_A: \beta_0 \neq 0$, with a significance level of α = 0.05.

P-value approach: from the SPSS table, the p-value of the constant is 0.172. Since 0.172 > 0.05, we do not reject H0.

Critical value approach: with df = 25 (found in the residual row of the ANOVA table) and α/2 = 0.025, the critical values of the t-distribution are ±2.06. Our t-value is -1.406, and |-1.406| < 2.06, so we do not reject H0.

This implies that the data give no evidence that the intercept differs from 0.

b. Test the hypothesis $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$ (you may decide α=0.1, α=0.05 or α=0.01 yourself). State your hypotheses and alpha. Do the test using both the critical value method and the p-value approach. If you reject/accept the null hypothesis, explain what this implies in terms of the data.


H0: and H1 :≠atα=0.05 Using p value: the given p value is close to 0, and is way smaller than 0.05, so we reject h0 Critical value: with df of 25 (foud from ANOVA tables residual) by using t value, we find alpha/2=0.025 is counter plus minus 2.06, and our t value is 11.79 so we reject null hypothesis

We do reject the null hypothesis. Hence, there is an effect of beta 1 on output
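The critical value method used in parts a and b can be reproduced from the coefficient table. The sketch below takes the estimates and standard errors from the SPSS output and the critical value 2.06 from the text; the function name `t_test` is introduced here for illustration.

```python
# Estimates and standard errors as reported in the SPSS coefficients table.
coefficients = {
    "(Constant)": (-48.393, 34.423),
    "shifts_worked": (8.440, 0.716),
}

# Two-sided critical value for alpha = 0.05 and df = 25, as stated in the text.
T_CRIT = 2.06

def t_test(estimate, std_error, t_crit):
    """Return the t-statistic for H0: coefficient = 0 and whether H0 is rejected."""
    t_stat = estimate / std_error
    return t_stat, abs(t_stat) > t_crit

results = {name: t_test(b, se, T_CRIT) for name, (b, se) in coefficients.items()}

for name, (t_stat, reject) in results.items():
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name}: t = {t_stat:.3f}, {decision}")
```

The t-statistic for shifts_worked computed this way differs slightly from the 11.791 in the SPSS output, because the table shows the estimate and standard error rounded to three decimals.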

3. Several EU countries have had large government debts over the last 20-30 years. In 1993, a certain level of budget deficit was formulated as an official economic policy target by the Maastricht Treaty. Economic theory states that the budget deficit may be described as follows:

$\text{Budget Deficit}_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \beta_3 X_{3t} + \beta_4 X_{4t} + \beta_5 X_{5t} + \varepsilon_t$

X1 = GDP (gross domestic product growth)
X2 = interest rate
X3 = price
X4 = gross national debt
X5 = unemployment

The file “UK budget.sav” contains data on the UK’s budget. Use this data to answer the following:

a. Test the hypothesis $H_0: \beta_1 = \dots = \beta_5 = 0$ against the alternative hypothesis $H_A$: at least one $\beta_j$ is non-zero. Based on your test, do you think the model is useful for predictions of the future budget deficit? State your hypotheses and alpha (choose between 0.1, 0.05 or 0.01). Do the test using either the critical value or the p-value approach. If you reject/accept the null hypothesis, explain what this implies in terms of the data. (Note: this is not covered in the video. Hint: SPSS calculates this F-test for you when you run the regression. Look at the ANOVA table.) The number of slope parameters is 5.

Suppose we use a significance level of α = 0.05. The p-value given in the ANOVA table is smaller than α, so we reject H0. We conclude that not all slope parameters in the multiple regression model equal zero.
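For reference, the F-statistic that the ANOVA table reports for this joint test has the standard form below. Here k = 5 is the number of slope parameters; n (the number of observations), SSR (regression sum of squares) and SSE (residual sum of squares) are notation introduced here, and their values are not given in the text.

\[
F = \frac{SSR / k}{SSE / (n - k - 1)} \sim F_{k,\; n-k-1} \quad \text{under } H_0 .
\]

H0 is rejected at level α when the p-value attached to this statistic is smaller than α, or equivalently when F exceeds the upper-α critical value of the F distribution with k and n - k - 1 degrees of freedom.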

b. Test if unemployment has a significant effect on the budget deficit (i.e. test if $\beta_5$ is significantly different from 0). State your hypotheses and alpha (choose between 0.1, 0.05 or 0.01). Do the test using either the critical value or the p-value approach. If you reject/accept the null hypothesis, explain what this implies in terms of the data.

Suppose we use α = 0.05.

$H_0: \beta_5 = 0$
$H_1: \beta_5 \neq 0$

According to the table, the p-value of unemployment is 0.781. Since 0.781 > 0.05, we do not reject the null hypothesis. Therefore, we find no evidence that unemployment has an effect on the budget deficit.

c. Check if the residuals are autocorrelated or not. (Hint: using SPSS, either plot the residuals against the lagged residuals and base your decision on this, or calculate the correlation coefficient between the residuals and the lagged residuals and base your decision on this.) Does your conclusion regarding autocorrelation have an implication for the reliability of the above hypothesis tests?

According to the graph, the residuals are not autocorrelated. This is consistent with the assumption of independent error terms, so the hypothesis tests for our multiple regression model are reliable in this respect.
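The second option in the hint, correlating the residuals with their own lag, can be sketched as follows. The residual series are invented to show the two extreme cases and are not taken from “UK budget.sav”; `lag1_correlation` is a helper name introduced here.

```python
import math

def lag1_correlation(residuals):
    """Pearson correlation between e_t and e_{t-1}."""
    current = residuals[1:]   # e_2, ..., e_n
    lagged = residuals[:-1]   # e_1, ..., e_{n-1}
    n = len(current)
    mean_c = sum(current) / n
    mean_l = sum(lagged) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_l = sum((l - mean_l) ** 2 for l in lagged)
    return cov / math.sqrt(var_c * var_l)

# Hypothetical residual series, NOT taken from "UK budget.sav".
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # each residual follows the previous one
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # the sign flips every period

r_trend = lag1_correlation(trending)
r_alt = lag1_correlation(alternating)
print(f"trending: {r_trend:.2f}, alternating: {r_alt:.2f}")
```

Both invented series are strongly autocorrelated (positively and negatively). A coefficient close to 0 would instead support the conclusion that the residuals are not autocorrelated.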

d. By checking the histogram, the residuals do not appear to be normally distributed. Will that pose a problem for your inferences (hypothesis tests etc.)?

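Not necessarily. The tests assume that the error term is normally distributed, but with a larger sample the coefficient estimates become approximately normally distributed (central limit theorem), so the tests remain approximately valid. With a small sample, the results should be interpreted with caution.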

4. A professor at the department of Economics hates animals. He is convinced that pigs are the major source of emissions that destroy ozone. Therefore the professor wants to investigate the impact of pigs on the ozone. However, the professor gets some comments at a seminar and is told that he also needs to include the variable population in the regression model.

a) Comment on the difference between the coefficient in the simple linear regression model and in the multiple linear regression model. Do you reject the null hypothesis in both models? Why do you think that is?

Alpha: 0.05
H0: pig meat has no effect
HA: pig meat has an effect

Simple linear regression model: a one-unit increase in pig meat is associated with a change of 1015 in ozone. Since alpha is greater than the reported p-value, we reject H0 in the simple linear model, so pig meat has an effect on ozone.

Multiple regression model: pig meat has no effect at all. Since alpha is smaller than the reported p-value, we do not reject H0 in the multiple linear regression model, so we find no evidence that pig meat has an effect on ozone.

The difference between the two models arises because the simple model does not include all relevant independent variables. When population is left out, pig meat picks up its effect, and we see a false causal relationship.
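The effect of leaving out a relevant variable can be illustrated with a small constructed example. The data below are hypothetical and noise-free, built so that ozone depends only on population while pigs merely tracks population; they are not the professor's data, and all variable names are introduced here.

```python
def cross_dev(x, y):
    """Sum of cross-products of deviations from the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Hypothetical data: ozone is driven ONLY by population.
population = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
pigs = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]  # closely tracks population
ozone = [3.0 * p for p in population]

s11 = cross_dev(pigs, pigs)
s22 = cross_dev(population, population)
s12 = cross_dev(pigs, population)
s1y = cross_dev(pigs, ozone)
s2y = cross_dev(population, ozone)

# Simple regression of ozone on pigs only.
simple_slope = s1y / s11

# Multiple regression of ozone on pigs and population
# (closed-form OLS solution for two regressors).
det = s11 * s22 - s12 ** 2
pigs_slope = (s22 * s1y - s12 * s2y) / det
population_slope = (s11 * s2y - s12 * s1y) / det

print(f"simple model, pigs slope:   {simple_slope:.3f}")
print(f"multiple model, pigs slope: {pigs_slope:.3f}")
print(f"multiple model, population: {population_slope:.3f}")
```

In the simple model the pigs coefficient is clearly positive, while in the model that also includes population it drops to zero. This is the same pattern as in the two SPSS regressions, under the assumption that population is the true driver.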

a. By looking at the scatterplot, does the variance of the error term look constant or not?

No, we have heteroscedasticity: as the x value increases, the residuals become larger. The constant-variance assumption behind the tests therefore does not hold, so the results cannot be trusted.

b. What is the consequence for the hypothesis testing (t-tests and F-tests) if one finds that the variance of the error term is non-constant?

Since we use the least squares method, our beta estimates will no longer be efficient, because the variance of the error term is large and not constant...
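A rough numerical counterpart to reading the scatterplot is to compare the spread of the residuals for low and high values of x. The sketch below uses invented residuals, already sorted by x, and a simple variance ratio. It is an informal check under these assumptions, not a formal test and not the procedure used in the lab.

```python
import statistics

# Hypothetical residuals ordered by increasing x, NOT taken from the lab data.
residuals_by_x = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]

half = len(residuals_by_x) // 2
low_x = residuals_by_x[:half]    # residuals for the smaller x values
high_x = residuals_by_x[half:]   # residuals for the larger x values

# Ratio of sample variances; a value far from 1 suggests non-constant variance.
variance_ratio = statistics.variance(high_x) / statistics.variance(low_x)
print(f"variance ratio (high x / low x): {variance_ratio:.1f}")
```

A ratio well above 1 matches the pattern described in the answer, where the residuals grow as x increases.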

