Statistics Module 10 Simple Linear Regression PDF

Title Statistics Module 10 Simple Linear Regression
Author Criselda Arcilla
Course accountancy
Institution Systems Plus College Foundation
Pages 6


Business Statistics: Module 10. Simple Linear Regression

Page 1 of 6

Module 10. Simple Linear Regression

Regression analysis

A parametric tool used to describe the linear relationship between the independent and dependent variables.



Develops a model to predict the values of the dependent variable based on the values of the independent variables.

Simple linear regression (SLR) 

The simplest type of regression analysis, involving one independent variable and one dependent variable, in which the relationship between the two variables is estimated by a straight line.



The SLR formula is as follows:

$$Y = a + bX \quad \text{or} \quad Y = b_0 + b_1 X$$

where

Y = dependent variable
X = independent variable
a or $b_0$ = y-intercept of the regression line, or the value of Y when X = 0
b or $b_1$ = slope, or the change in Y for every unit change in X

To develop the linear regression model, you first need to solve for the slope and the y-intercept, using the following formulas:

Slope (b or $b_1$):

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

where n = number of paired observations in the sample

y-intercept (a or $b_0$):

$$a = \frac{\sum y - b\sum x}{n}$$
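The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the module: the function name `slope_intercept` and the four data points are hypothetical, chosen so that the points lie exactly on Y = 1 + 2X and the result is easy to check by hand.

```python
def slope_intercept(xs, ys):
    """Return (b, a) for the line Y = a + bX using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom  # slope
    a = (sum_y - b * sum_x) / n               # y-intercept
    return b, a


# Hypothetical data lying exactly on Y = 1 + 2X
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
b, a = slope_intercept(xs, ys)
print(f"slope = {b}, intercept = {a}")
```

Note that the intercept depends on the slope, so the slope has to be computed first.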

In addition, the coefficient of correlation and the coefficient of determination are used in regression analysis to measure the strength of the relationship between the independent and dependent variables. These two measures were discussed in Module 9.

Coefficient of correlation (r):

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Coefficient of determination = $r^2$

For the problem illustrations, let us use the same two problems from Module 9.
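As a rough sketch of how the coefficient of correlation and the coefficient of determination follow from the same column totals, the Python snippet below applies the formula for r directly. The function name `correlation` and the five data points are hypothetical and serve only as an illustration.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation r from paired data, via the summation formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is the square root of the product of the two bracketed terms.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("X or Y has no variation; r is undefined")
    return numerator / denominator


# Hypothetical data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
r2 = r ** 2  # coefficient of determination
print(f"r = {r:.4f}, r^2 = {r2:.4f}")
```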


Problem illustrations

1. A group of independent researchers investigated the relationship between the calories and fat of coffee drinks at two well-known coffee shops. The table below shows the result of the experiment.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the fat of a coffee drink with 300 calories.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Products | Calories (X) | Fat (Y) | X*Y | X^2 | Y^2 |
|---|---|---|---|---|---|
| K iced mocha swirl latte | 240 | 8.0 | 1920 | 57600 | 64 |
| Z coffee Frappuccino blended coffee | 260 | 3.5 | 910 | 67600 | 12.25 |
| K coffee coolatta | 350 | 22.0 | 7700 | 122500 | 484 |
| Z coffee iced coffee mocha espresso | 350 | 20.0 | 7000 | 122500 | 400 |
| K mocha Frappuccino blended coffee | 420 | 16.0 | 6720 | 176400 | 256 |
| Z chocolate brownie Frappuccino blended coffee | 510 | 22.0 | 11220 | 260100 | 484 |
| Z chocolate Frappuccino blended crème | 530 | 19.0 | 10070 | 280900 | 361 |
| Total | 2660 | 110.5 | 45540 | 1087600 | 2061.25 |
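The derived columns and the totals row of this data matrix can be reproduced with a short Python sketch. The variable names are illustrative; the calorie and fat values are the ones listed in the table, and each derived entry is computed from its paired X and Y values.

```python
# Calories (X) and fat (Y) for the seven coffee drinks in the table
calories = [240, 260, 350, 350, 420, 510, 530]
fat = [8.0, 3.5, 22.0, 20.0, 16.0, 22.0, 19.0]

# Each row holds X, Y, X*Y, X^2, Y^2
rows = [(x, y, x * y, x ** 2, y ** 2) for x, y in zip(calories, fat)]

# Column totals: sum each position across all rows
totals = tuple(sum(column) for column in zip(*rows))

for row in rows:
    print(row)
print("Total", totals)
```

The totals row is all that the slope, intercept, and correlation formulas need, which is why the data matrix is built before any other computation.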

In this module, we set aside hypotheses, level of significance, critical values, decision rules, conclusions, and recommendations for the time being and proceed directly with the computation. Initially, the table above had only its first three columns; we added three more (X*Y, X^2, and Y^2) to complete the data matrix. For column X*Y, multiply each value of X by its paired value of Y. For column X^2, multiply each value of X by itself. For column Y^2, multiply each value of Y by itself. Lastly, get the total of each column.

a. Slope (b or $b_1$):

$$b = \frac{7(45540) - (2660)(110.5)}{7(1087600) - (2660)^2} = \frac{24850}{537600} = 0.046 \approx 0.05$$

b. Y-intercept (a or $b_0$), using b = 0.046:

$$a = \frac{110.5 - 0.046(2660)}{7} = -1.69$$

c. Linear regression model: Y = -1.69 + 0.046X (since it is a model, we retain X). The slope is kept at three decimal places because X is in the hundreds, so rounding the slope to 0.05 would noticeably shift the predictions.

d. For a coffee drink with 300 calories: Y = -1.69 + 0.046(300) = 12.11

e. Coefficient of correlation:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{7(45540) - (2660)(110.5)}{\sqrt{\left[7(1087600) - (2660)^2\right]\left[7(2061.25) - (110.5)^2\right]}} = 0.7196$$

The r-value of 0.7196 reflects a high positive relationship between the calories and fat of coffee drinks.

f. $r^2 = 0.7196^2 = 0.5178$ or 51.78%, which means that 51.78% of the variation in the fat of the coffee drinks can be explained by the variation in their calories; the remaining 48.22% is due to unexplained factors or factors not included in this study.

2. The store manager wants to determine the relationship between the number of weekend television commercials shown and the sales of stereo and sound equipment at the store. The table below shows the gathered data.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the sales for 8 weekend TV commercials.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Week | Number of weekend TV commercials (X) | Sales (Y) ($100s) | X*Y | X^2 | Y^2 |
|---|---|---|---|---|---|
| 1 | 3 | 52 | 156 | 9 | 2704 |
| 2 | 6 | 58 | 348 | 36 | 3364 |
| 3 | 2 | 43 | 86 | 4 | 1849 |
| 4 | 4 | 55 | 220 | 16 | 3025 |
| 5 | 5 | 56 | 280 | 25 | 3136 |
| 6 | 2 | 40 | 80 | 4 | 1600 |
| 7 | 5 | 64 | 320 | 25 | 4096 |
| 8 | 3 | 49 | 147 | 9 | 2401 |
| 9 | 4 | 60 | 240 | 16 | 3600 |
| 10 | 1 | 39 | 39 | 1 | 1521 |
| Total | 35 | 516 | 1916 | 145 | 27296 |

a. Slope (b or $b_1$):

$$b = \frac{10(1916) - (35)(516)}{10(145) - (35)^2} = \frac{1100}{225} = 4.89$$

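b. Y-intercept (a or $b_0$):

$$a = \frac{516 - 4.89(35)}{10} = 34.49$$

c. Linear regression model: Y = 34.49 + 4.89X

d. For 8 weekend TV commercials: Y = 34.49 + 4.89(8) = 73.61 (in $100s, or about $7,361 in sales)

e. Coefficient of correlation:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{10(1916) - (35)(516)}{\sqrt{\left[10(145) - (35)^2\right]\left[10(27296) - (516)^2\right]}} = 0.8956$$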


There is a high positive correlation between the number of weekend television commercials and the sales of stereo and sound equipment.

f. $r^2 = 0.8956^2 = 0.8021$ or 80.21%, which means that 80.21% of the variation in the store's sales of stereo and sound equipment can be explained by the variation in the number of weekend television commercials; the remaining 19.79% is caused by other unexplained factors.
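The steps of Problem 2 can be checked end to end with a short Python sketch. This is an illustration under stated assumptions rather than part of the module: the variable names are hypothetical, and the script keeps full precision throughout, so its results can differ in the last digit from a hand calculation that rounds the slope and intercept to two decimals first.

```python
import math

# Data from Problem 2: weekend TV commercials (X) and sales in $100s (Y)
commercials = [3, 6, 2, 4, 5, 2, 5, 3, 4, 1]
sales = [52, 58, 43, 55, 56, 40, 64, 49, 60, 39]

n = len(commercials)
sum_x = sum(commercials)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(commercials, sales))
sum_x2 = sum(x * x for x in commercials)
sum_y2 = sum(y * y for y in sales)

# a. slope and b. y-intercept
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# d. predicted sales (in $100s) for 8 weekend commercials
predicted = a + b * 8

# e. coefficient of correlation and f. coefficient of determination
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r2 = r ** 2

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"predicted sales for 8 commercials = {predicted:.2f} ($100s)")
print(f"r = {r:.4f}, r^2 = {r2:.4f}")
```

With unrounded intermediate values the prediction comes out at 73.60 rather than 73.61, and $r^2$ at 0.8022 rather than 0.8021. Differences of this size come from rounding and do not change the interpretation.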

End of Module Exercises

Read and analyze the problems carefully.

1. A production manager has compared the dexterity test scores of seven assembly line employees with their hourly productivity. The table below shows the result.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the hourly productivity for a dexterity score of 20.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Employee | Dexterity test score (X) | Hourly productivity (Y) |
|---|---|---|
| A | 13 | 56 |
| B | 15 | 64 |
| C | 18 | 68 |
| D | 17 | 71 |
| E | 12 | 52 |
| F | 14 | 62 |
| G | 16 | 64 |
| Total | | |

2. It has been reported that the average American male consumes 3774 calories per day and that 72.2% of American males are overweight. This information, along with data for seven other countries, is shown below.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the % overweight for an American male who consumes 4000 calories per day.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Country | Calories per day | % overweight |
|---|---|---|
| A | 2214 | 10.6 |
| B | 2975 | 27.7 |
| C | 3458 | 40.4 |
| D | 3523 | 64.7 |
| E | 2560 | 62.8 |
| F | 3257 | 15.3 |
| G | 3885 | 70.0 |
| Total | | |


3. In a certain company, employees who have stayed with the firm for more than five years are given the opportunity to own stock shares at a discounted price as a form of reward. The personnel manager wants to know whether an employee's number of years in service with the firm influences the number of stock shares owned.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the number of stock shares owned by an employee with 20 years in service with the firm.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Employee | No. of years (X) | No. of stock shares (Y) |
|---|---|---|
| A | 7 | 315 |
| B | 13 | 418 |
| C | 15 | 570 |
| D | 7 | 273 |
| E | 10 | 300 |
| F | 14 | 665 |
| G | 16 | 660 |
| H | 10 | 310 |
| Total | | |

4. The Insurance Institute for Highway Safety has listed the following ratings based on collision and comprehensive claims for nine makes of midsize four-door cars from the 2014-2016 model years. Higher numbers reflect higher claims in the collision and comprehensive categories of coverage.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the comprehensive claim rating for a collision rating of 130.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Claims | Collision claim rating (X) | Comprehensive claim rating (Y) |
|---|---|---|
| 1 | 113 | 89 |
| 2 | 108 | 91 |
| 3 | 124 | 92 |
| 4 | 131 | 108 |
| 5 | 128 | 108 |
| 6 | 90 | 74 |
| 7 | 99 | 79 |
| 8 | 106 | 86 |
| 9 | 116 | 98 |
| Total | | |

5. A mail-order catalog business that sells personal computer supplies, software, and hardware maintains a centralized warehouse for the distribution of products ordered. Management is currently examining the process of distribution from the warehouse and is interested in studying the factors that affect warehouse distribution costs. Currently, a small handling fee is added to the order, regardless of the amount. The table below shows the data (distribution costs in thousands of dollars and number of orders) collected for the past 12 months.

a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the number of orders for distribution costs of 90.25 (in thousands of dollars).
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Month | Distribution costs | Number of orders |
|---|---|---|
| 1 | 52.95 | 4015 |
| 2 | 71.66 | 3806 |
| 3 | 85.58 | 5309 |
| 4 | 63.69 | 4262 |
| 5 | 72.81 | 4296 |
| 6 | 68.44 | 4097 |
| 7 | 52.46 | 3213 |
| 8 | 70.77 | 4809 |
| 9 | 82.03 | 5237 |
| 10 | 74.39 | 4732 |
| 11 | 70.84 | 4413 |
| 12 | 54.08 | 2921 |
| Total | | |


