Week 2 Linear Regression Analysis - tutorial worksheet PDF

Title Week 2 Linear Regression Analysis - tutorial worksheet
Course Introduction To Statistical Reasoning
Institution Monash University
Pages 4

Summary

Week 2 worksheet from the weekly tutorial sheets. Covers association, correlation, residuals and regression analysis. Class from 2020....


Description

SCI1020: Introduction to Statistical Reasoning

WEEK 2: LINEAR REGRESSION

EXPLORING DATA - Relationship between two Quantitative Variables

Student's Name:

Tutorial Day/Time:

PRELIMINARY READING: D S Moore et al, “Basic Practice of Statistics”, Chs 4-5.

On completion of this workshop you should be able to:

1. Produce a scatterplot of quantitative data with appropriate explanatory and response axes;

2. Recognise a linear pattern and the general formula for a straight line;

3. Calculate a predicted value given the equation of the linear regression line;

4. Add a linear line of best fit to data using MS EXCEL, and describe the regression line (equation and correlation);

5. Assess the closeness of fit using the least-squares criterion as reflected in the correlation coefficient;

6. Obtain residual values and interpret their size and distribution about the line in the form of a residual plot;

7. Find or calculate and interpret the squared correlation, r².

PRELIMINARY QUESTIONS: These problems are to help you engage with the lecture material, and also to make sure that everyone is up to speed before the workshop starts. Please make sure you do them before class each week!

Q.1 State in your own words what is meant by each of the terms listed below. Be specific.

Term: Definition

Explanatory variable: Also known as the independent variable. It is the variable that is manipulated, or that is thought to explain or influence changes in the response variable.

Response variable: Also known as the dependent variable. Its change is explained or affected by the explanatory variable; it measures the outcome of a study.

Association: A relationship between two variables that are statistically dependent on each other, described by the direction of the trend (positive, negative or none).

Correlation: A measure of the direction and strength of the linear association between two quantitative variables. It is quantified by the correlation coefficient r.

Regression line: A straight line that describes how the dependent variable (y) changes as the independent variable (x) changes; the line of best fit.

Residual: The difference between the actual value of y and the y value predicted by the regression line, for each x value in the data. It is calculated by the formula residual = y - ŷ, where y is the observed value and ŷ is the predicted value.
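As a rough illustration of the correlation and residual definitions, the sketch below computes the correlation coefficient r and the residuals (observed y minus predicted y) for a small data set. The data values, and the slope and intercept of the line, are hypothetical and chosen only for illustration; they do not come from the worksheet.

```python
import math

# Hypothetical data, not taken from the worksheet.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(xs, ys, slope, intercept):
    """Residual for each point: observed y minus the y predicted by the line."""
    return [b - (slope * a + intercept) for a, b in zip(xs, ys)]


r = correlation(x, y)
# Slope 0.6 and intercept 2.2 are the least-squares values for this
# hypothetical data set.
res = residuals(x, y, slope=0.6, intercept=2.2)

print(round(r, 3))                 # 0.775
print([round(e, 1) for e in res])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

Points above the line have positive residuals and points below it have negative residuals; for a least-squares line the residuals sum to zero.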

Q.2 What is the general equation of a straight line? Define all the terms in the equation.

The general equation of a straight line is y = mx + c, where y is the response variable, m is the slope (the change in y for each one-unit increase in x), x is the explanatory variable and c is the y-intercept (the value of y when x = 0).
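A minimal sketch of how a predicted value is read off a straight line y = mx + c. The slope and intercept used here are hypothetical numbers, not values from the worksheet or the textbook.

```python
def predict(x, slope, intercept):
    """Return the y value on the line y = slope * x + intercept."""
    return slope * x + intercept


# Hypothetical line with slope m = 2.5 and intercept c = 10.
m, c = 2.5, 10.0

print(predict(0, m, c))  # 10.0, the intercept is the value of y at x = 0
print(predict(4, m, c))  # 20.0
# Moving one unit along x changes y by exactly the slope.
print(predict(5, m, c) - predict(4, m, c))  # 2.5
```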

Week 2

Copyright 2019: Monash University


Q.3 Do Q5.2 from Moore et al text, p.130. What is the regression line equation based on the description of the trend in this example?

R = No. of individuals taking up regular running exercise

C = No. of cigarettes smoked daily

Intercept = 48 million

Slope = -0.178 (for each one-unit increase in R, C decreases by 0.178)

Hence the regression line equation is C = -0.178R + 48 million.

WORKSHOP PROBLEMS:

Q.4 Demonstration of correlation and least squares regression.

a) Go to the website http://digitalfirst.bfwpub.com/stats_applet/stats_applet_5_correg.html (note that the spaces in the URL are underscores). Create a scatterplot with a linear trend (similar to plot #1 below). Observe the size of the correlation coefficient for different scatter patterns. Use “Draw your own line” to draw a line of best fit. Change the intercept and slope, trying to minimise the sum of the squares of the residuals as shown by the “relative SS” value. Compare your line with the “Show least-squares line” option, which places the line by calculation. No written answers are required here; just observe the values.

b) Describe the relationship in the x-y data plotted below:

[Figure: three scatterplots, numbered 1 to 3. Panel titles: "Change in pulse rate with exercise", "Quiz score vs chocolate consumption", "Measured radioactive decay". Horizontal axis labels: "Pulse rate before exercise (beats per minute)", "Daily Chocolate consumption (g)", "Time (mins)".]

Identify the association (positive/negative/none) and correlation (strong/moderate/weak/none) present.

PLOT                       1       2          3

Association                None    Positive   Negative

Correlation                None    Strong     Moderate

Estimate r (if approp.)    0       0.9...
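The least-squares demonstration in Q.4 a) can be mimicked with a short sketch: fit the least-squares line by calculation, then compare its sum of squared residuals with that of a hand-picked line. The data set and the "hand-drawn" slope and intercept are hypothetical, and the sketch does not attempt to reproduce the applet's own "relative SS" calculation.

```python
def least_squares_line(xs, ys):
    """Slope and intercept that minimise the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed y - predicted y)^2 over all data points."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(xs, ys))


# Hypothetical data, not taken from the worksheet.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept = least_squares_line(x, y)
ss_best = sum_squared_residuals(x, y, slope, intercept)

# A hypothetical hand-drawn guess: y = 1.0 * x + 1.0
ss_guess = sum_squared_residuals(x, y, 1.0, 1.0)

# Squared correlation: the fraction of the variation in y explained by the line.
mean_y = sum(y) / len(y)
ss_total = sum((b - mean_y) ** 2 for b in y)
r_squared = 1 - ss_best / ss_total

print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
print(round(ss_best, 2), round(ss_guess, 2))  # 2.4 4.0
print(round(r_squared, 2))                    # 0.6
```

No other straight line gives a smaller sum of squared residuals than the least-squares line, which is why any hand-drawn guess scores at least as high.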

