0. Supplementary Course Packet (2nd Edition) PDF

Title 0. Supplementary Course Packet (2nd Edition)
Author Kendall Sturgill
Course Econ & Bus Statistics
Institution University of Kentucky
Pages 107

ECO 391 Economic and Business Statistics

University of Kentucky
Course Package: Second Edition
Designed by: Dr. Alejandro Dellachiesa, Philip Meersman

Table of Contents

Chapter 3: Measures of Central Location and Dispersion
1. Mean
2. Median
3. Mode
4. Weighted Mean
5. Range
6. Variance and Standard Deviation
7. Coefficient of Variation
8. Covariance
9. Correlation Coefficient

Chapter 7: Sampling and Sampling Distribution
1. Simple Sampling
2. Convenience Sampling
3. Systematic Sampling
4. Cluster Random Sample
5. Stratified Random Sample
6. Selection Bias
7. Non-response Bias

Chapter 6: Normal and Standard Normal Distribution
1. Discrete and Continuous Variables
2. Normal Distribution and Normal Standard Distribution
3. The Empirical Rule
4. Class Examples

Chapter 9: Hypothesis Testing
1. Formulating the H0 and the HA hypothesis
2. One-Tailed versus Two-Tailed Tests
3. Type I and type II errors
4. Developing the null (H0) and alternative (HA) Hypothesis
5. Choose the level of significance (α)
6. Calculate the value of the test statistic and the P-value
7. Draw a conclusion and interpret the results

Chapter 14: Simple Regression Model
1. Independent and Dependent Variables
2. Simple versus Complex Relationships
3. Positive versus Negative Relationships
4. Strong versus Weak Relationships
5. Linear versus Non-linear Relationships
6. Deterministic and Stochastic Models
7. Simple Regression Analysis (Stochastic Models)


Chapter 14: Simple Regression Model
1. Simple Regression Model (Tips and Meals)
2. Sum of Squared Residuals (SSE)
3. Simple Regression Model (Tips and Bills)
4. The Centroid
5. The Line of Best Fit
6. The Coefficient Estimates β0 and β1

Chapter 14: Simple Regression Model
1. Calculating the Coefficient estimates β̂0 and β̂1
2. Comparing models I and II
3. Calculating the Sum of Squared of the Errors (SSE)
4. Calculating Ȳ and Ŷ

Chapter 12: The Coefficient of Determination: The R-squared
1. The Coefficients Estimates (β0 and β1)
2. The R-Squared
3. The SSR, SSE, and SST
4. The SSE (the Sum of the Squared Errors)
5. The R-Squared and the Adjusted R-Squared
6. Spurious Regressions

Chapter 15: Multiple Regression Model
1. Simple and Multiple Linear Regression Analysis
2. Variable Selection in Multiple Regression Models
Installing Data Analytics on your Computer
3. Model 1, Model 2, and Model 3
4. Looking at the R2 and the Adjusted R2
5. Comparing Models

Chapter 15: Multiple Linear Regression Model
1. Case Study: The Butler Trucking Company
2. Simple Linear Regression Models: Interpreting B̂0 and B̂1
3. Multiple Linear Regression Models: Interpreting B̂0, B̂1, and B̂2
4. Practice Exercise
5. Interpreting the Coefficients

Chapter 15: Dummy Variables
1. Quantitative and Qualitative (Categorical) Information
2. Quantifying Qualitative (Categorical) Information
3. Models with Two Categories
4. Models with More than Two Categories
5. Combining Qualitative and Quantitative Data


Chapter 8: Confidence Intervals
1. What is the Purpose of a Confidence Interval?
2. Constructing a Confidence Interval
3. What Factors Determine How Wide is the Interval?
4. Calculating a Confidence Interval

Non-linear models: Percentage and Percentage Point Changes
1. Calculating Percentage Point and Percentage Change
2. Percentage Change: Using Natural Logs
3. Percentage Points: Using Percentages and Decimals
4. Developing your Intuition
5. Quadratic Regression Models

Finding the P-value in Hypothesis Testing: Calculating the P-value
1. Difference Between t-values and P-values
2. Why using P-values
3. Left and right Tail Test

Finding the F-values in Hypothesis Testing: Calculating the F-value
1. SST, SSR, and SSE
2. R-squared and Adjusted R-squared
3. t-values and P-values
4. Confidence intervals
5. F-values and the F-distribution

Chapter 15: Multicollinearity
1. Define Multicollinearity and describe its consequences
2. How Problematic is Multicollinearity
3. When Multicollinearity becomes harmful
4. Checking for Multicollinearity

Heteroskedasticity
1. Define Heteroskedasticity and describe its consequences
2. Understand Heteroskedasticity and Homoskedasticity
3. Identify when Heteroskedasticity is harmful
4. Fix Heteroskedasticity problems

Chapter 16: Autocorrelation
1. Defining autocorrelation and describing its consequences
2. Testing for autocorrelation (also called serial correlation)
3. Correcting statistical models when autocorrelation is present


Chapter 3: Measures of Central Location and Dispersion

Outline:

Measures of Central Location
1. Mean
2. Median
3. Mode
4. Weighted Mean

Measures of Variability
5. Range
6. Variance and Standard Deviation
7. Coefficient of Variation

Measures of Association
8. Covariance
9. Correlation Coefficient

(Read pages 110 to 114 and 128 to 132 of the paper version of the textbook)

Measures of Central Location

1. The Mean

The mean of a data set is the ___________________ of all the data values. The sample mean ___________ is the point estimator of the population mean ____________.

Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$

Where:
$\sum x_i$ = sum of the values of the n observations
n = number of observations in the sample

Population Mean: $\mu = \frac{\sum x_i}{N}$

Where:
$\sum x_i$ = sum of the values of the N observations
N = number of observations in the population

Example: Sample Mean


Consider the salaries of employees at Alcoa. Online, you find out that the average salary for that company is over $150,000!

$\bar{x} = \frac{\sum x_i}{n} =$ ____________

This mean __________________________ the typical salary because it is susceptible to outliers!
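A minimal sketch of why a single extreme value pulls the mean away from the typical observation. The salary figures below are hypothetical, chosen only for illustration; they are not the Alcoa figures from the class example.

```python
from statistics import mean

# Hypothetical annual salaries in dollars (an assumption, not data from the packet):
# five staff members and one highly paid executive.
staff = [40_000, 45_000, 50_000, 55_000, 60_000]
executive = 650_000

# Sample mean = sum of the values divided by the number of observations.
mean_with_outlier = mean(staff + [executive])
mean_without_outlier = mean(staff)

print(f"mean with the executive:    {mean_with_outlier:,.0f}")
print(f"mean without the executive: {mean_without_outlier:,.0f}")
```

In this made-up sample the mean triples once the executive is included, even though five of the six salaries are at or below $60,000.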


2. The Median

The median of a data set is the value ___________________ when the data items are arranged in ascending order. Whenever a data set has extreme values, ________________ is the preferred measure of central location.

For an odd number of observations:

Data: 26, 18, 27, 12, 14, 27, 19

In ascending order: 12, 14, 18, 19, 26, 27, 27

Median ____________________

For an even number of observations:

Data: 26, 18, 27, 12, 14, 27, 19, 30

In ascending order: 12, 14, 18, 19, 26, 27, 27, 30

Median ____________________
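The two cases can be checked with a short Python sketch. It uses the two data sets listed above; the helper name `median_by_hand` is an illustrative choice, and the standard-library `statistics.median` is shown only as a cross-check.

```python
from statistics import median

odd_data = [26, 18, 27, 12, 14, 27, 19]        # 7 observations
even_data = [26, 18, 27, 12, 14, 27, 19, 30]   # 8 observations

def median_by_hand(values):
    """Sort the data, then take the middle value (odd n)
    or the average of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_hand(odd_data), median(odd_data))
print(median_by_hand(even_data), median(even_data))
```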

3. The Mode

The mode is the _______________________________ in a data set.

The greatest frequency can occur at two or more different values. A data set can have no mode, one mode ( ____________ ), or many modes ( ____________ ).

Example: Consider the salaries of employees at Alcoa

The mode is ________________ since this appears most often.

Excel’s Mean Function: =AVERAGE(data cell range)

Excel’s Median Function: =MEDIAN(data cell range)

Excel’s Mode Function: =MODE.SNGL(data cell range)
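For readers working outside Excel, the sketch below shows one possible way to find modes with Python's standard `statistics` module. The three small data sets are illustrative assumptions, not the salary data from the class example.

```python
from statistics import mode, multimode

one_mode = [12, 14, 18, 19, 26, 27, 27]   # 27 appears twice
two_modes = [12, 12, 18, 19, 27, 27]      # 12 and 27 both appear twice
no_repeats = [12, 14, 18, 19]             # every value appears once

single = mode(one_mode)          # the single most frequent value
several = multimode(two_modes)   # a list of all values tied for most frequent
ties = multimode(no_repeats)     # with no repeats, every value is tied

print(single, several, ties)
```

Note that `multimode` returns every value when nothing repeats, so a result as long as the data set itself signals that the data have no meaningful mode.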

4. The Weighted Mean

The weighted mean is relevant when some observations contribute more than others. Example: In most classes, different assignments have different weights. Let W1, W2, …, Wn denote the weights of the assignments X1, X2,…, Xn such that: W1 + W2 +…+ Wn = 1.


The weighted mean for the sample is computed as:

$\bar{X}_w = \sum_{i=1}^{n} W_i X_i$
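As a rough illustration of the weighted mean, the sketch below multiplies each score by its weight and adds the results. The assignment names, weights, and scores are hypothetical; the only requirement carried over from the text is that the weights sum to 1.

```python
from math import isclose

# Hypothetical course components: (weight, score). These values are assumptions.
components = {
    "homework": (0.20, 90),
    "midterm": (0.30, 80),
    "final": (0.50, 70),
}

weights = [w for w, _ in components.values()]
assert isclose(sum(weights), 1.0), "weights must sum to 1"

# Weighted mean = W1*X1 + W2*X2 + ... + Wn*Xn
weighted_mean = sum(w * x for w, x in components.values())

# Unweighted mean of the same scores, for comparison.
simple_mean = sum(x for _, x in components.values()) / len(components)

print(f"weighted mean = {weighted_mean:.1f}, simple mean = {simple_mean:.1f}")
```

Because the final exam carries half the weight and has the lowest score, the weighted mean comes out below the simple mean in this example.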

Measures of Variability

5. The Range

The range of a data set is the difference between _____________ and _____________ data values. Example: Monthly Starting Salary

Range = _________________________

6. The Variance and the Standard Deviation

The variance is the _______________ of the squared differences between each data value and the ______________.

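The variance is computed as follows:

Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$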


The standard deviation of a data set is the positive _____________ of the variance. It tells us how far the measurements for a group are spread out from the mean (the expected value).
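A short sketch of the calculation, assuming the sample versions of the formulas (sum of squared deviations divided by n − 1, which is what Excel's VAR.S and STDEV.S also use). The data values are hypothetical and are not the exercise data.

```python
from math import sqrt
from statistics import mean, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, not from the packet

x_bar = mean(data)
squared_deviations = [(x - x_bar) ** 2 for x in data]

# Sample variance: sum of squared deviations divided by (n - 1).
sample_variance = sum(squared_deviations) / (len(data) - 1)
# Standard deviation: positive square root of the variance.
sample_std = sqrt(sample_variance)

print(f"by hand:    s^2 = {sample_variance:.3f}, s = {sample_std:.3f}")
print(f"statistics: s^2 = {variance(data):.3f}, s = {stdev(data):.3f}")
```

The hand calculation and the library functions should agree; dividing by n instead of n − 1 would give the population variance.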

Exercise: Calculate the Variance and the Standard Deviation. 5

Variance: _____________________

Standard Deviation: ______________________

7. The Coefficient of Variation

The coefficient of variation indicates how large _________________ is in relation to the mean. The coefficient of variation is computed as follows:

$CV = \frac{s}{\bar{x}}$ (often multiplied by 100 and reported as a percentage)

Variance: _____________________

Standard Deviation: _______________________

Coefficient of Variation: ________________________

Excel’s Variance Function: =VAR.S(data cell range)

Excel’s Standard Deviation Function: =STDEV.S(data cell range)

Excel’s CV Function: =STDEV.S(data cell range) / AVERAGE(data cell range)
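The same ratio used in the Excel formula can be sketched in Python. The function name `coefficient_of_variation` and both data sets are illustrative assumptions; the point is that the CV lets you compare spread across data sets with very different means.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)

# Two hypothetical samples with different scales (assumed values).
small_scale = [2, 4, 4, 4, 5, 5, 7, 9]
large_scale = [200, 400, 400, 400, 500, 500, 700, 900]

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)

print(f"CV (small scale) = {cv_small:.2%}")
print(f"CV (large scale) = {cv_large:.2%}")
```

The second sample is the first multiplied by 100, so its standard deviation is 100 times larger but its CV is the same.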

8. The Covariance and Correlation Coefficient

The covariance (SYX or SXY) describes _______________ of the linear relationship between two variables, X and Y. The correlation coefficient (RXY or RYX) describes both _________________________ of the relationship between two variables, X and Y.
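A minimal sketch of both measures, assuming the usual sample formulas: the covariance is the sum of the products of deviations divided by n − 1, and the correlation coefficient is the covariance divided by the product of the two sample standard deviations. The paired data are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical paired observations (assumed values, not from the packet).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Sample covariance: average product of deviations, using n - 1.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)

# Correlation coefficient: covariance rescaled to lie between -1 and 1.
r_xy = s_xy / (stdev(x) * stdev(y))

print(f"covariance  = {s_xy:.3f}")
print(f"correlation = {r_xy:.3f}")
```

The sign of both numbers is the same; only the correlation coefficient is unit-free, which is why it can be read as a measure of strength.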

The coefficient of variation for a population is computed as follows:

Chapter 7: Sampling and Sampling Distribution

Outline:

Describe various sampling methods
1. Simple Sampling
2. Convenience Sampling
3. Systematic Sampling
4. Cluster Random Sample
5. Stratified Random Sample

Explain the Most Common Sample Biases
6. Selection Bias
7. Non-response Bias

(Read pages 350 to 352 of the paper version of the textbook)

Two branches of Statistics:
1. Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent either the entire population or a sample.
2. Inferential statistics use a random sample of data drawn from a population to describe and make inferences about that population.

Population: consists of all items of interest in a statistical problem.

Sample: a subset of the population.

The Sample Statistic is calculated from the sample and can be used to make _______________ about the population.

Why do we take samples and not survey the whole population? 1. Obtaining information on the entire population is __________________

2. In many cases it is impossible to ___________________________________

Notice: A survey that measures the entire target population is called a ___________

Describe various sampling methods
1. Simple Sampling
2. Convenience Sampling
3. Systematic Sampling
4. Cluster Random Sample
5. Stratified Random Sample

1. Simple Random Sample

A simple random sample is a sample of n observations that has the same probability of being selected from the population as any other sample of n observations. It is theoretically the ideal method of sampling.
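One possible way to draw a simple random sample in Python is `random.sample`, which selects without replacement so that every subset of size n is equally likely. The population of 100 numbered IDs and the seed are assumptions made for the illustration.

```python
import random

# Hypothetical population: ID numbers 1 through 100 (an assumption).
population = list(range(1, 101))

rng = random.Random(2024)          # fixed seed only so the example is repeatable
sample = rng.sample(population, 10)  # 10 distinct IDs, each subset equally likely

print(sorted(sample))
```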


Advantages:

________________________________________________________________

________________________________________________________________

Disadvantages:

________________________________________________________________

________________________________________________________________

2. Convenience Sampling

Survey whichever students happen to be walking on campus or through a shopping mall, or take the next 20 objects off the production line.

Advantages:


________________________________________________________________

________________________________________________________________

Disadvantages:

________________________________________________________________

________________________________________________________________

3. Systematic Sampling

Choose a starting point randomly and then systematically take objects at a certain number apart.

________________________________________________________________
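The systematic procedure described above can be sketched as follows. The function name `systematic_sample`, the interval rule k = N // n, and the population of 100 IDs are illustrative assumptions; the sketch also assumes n is no larger than N.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start in the first interval, then take every k-th item."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval (assumes n <= N)
    start = rng.randrange(k)      # random starting position: 0, 1, ..., k - 1
    return population[start::k][:n]

# Hypothetical population: items numbered 1 through 100 (an assumption).
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=1)

print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the interval.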

Advantages:

________________________________________________________________

________________________________________________________________

Disadvantages:

________________________________________________________________

________________________________________________________________

4. Cluster Random Sample

Divide the population into mutually exclusive and collectively exhaustive groups, called clusters. Randomly select the clusters and then sample every observation in those randomly selected clusters.
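A small sketch of cluster sampling under the definition above: choose whole clusters at random, then keep every observation inside the chosen clusters. The dormitory names and member labels are hypothetical, as is the helper name `cluster_sample`.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose clusters, then include every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: students grouped by dormitory (assumed data).
clusters = {
    "Dorm A": ["A1", "A2", "A3"],
    "Dorm B": ["B1", "B2"],
    "Dorm C": ["C1", "C2", "C3", "C4"],
    "Dorm D": ["D1", "D2"],
}

chosen, members = cluster_sample(clusters, 2, seed=42)
print(chosen, members)
```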

Advantages:

________________________________________________________________

________________________________________________________________

Disadvantages:

________________________________________________________________

________________________________________________________________

5. Stratified Random Sample

Divide the population into mutually exclusive and collectively exhaustive groups, called strata. Randomly select observations from each stratum, with the number selected from each stratum proportional to that stratum’s size.
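A hedged sketch of proportional stratified sampling. The strata sizes (600 people over 35 and 300 people aged 35 or under), the labels, and the helper name `stratified_sample` are assumptions for illustration.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Draw from every stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can make the overall total differ slightly from n.
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical population: two thirds over 35, one third 35 or under.
strata = {
    "over_35": [f"older_{i}" for i in range(600)],
    "35_or_under": [f"younger_{i}" for i in range(300)],
}

sample = stratified_sample(strata, 90, seed=7)
print({name: len(members) for name, members in sample.items()})
```

Unlike cluster sampling, every stratum is represented, and the 2-to-1 split of the population is reproduced in the sample.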

Advantages:

________________________________________________________________

________________________________________________________________

Disadvantages:

________________________________________________________________

________________________________________________________________

Explain the Most Common Sample Biases

Selection Bias: A systematic exclusion of certain groups from consideration for the sample. The sample is ________________________ of the population intended to be analyzed.


Your conclusion or policy recommendation here:

______________________________________________________________

______________________________________________________________

Non-response Bias: A systematic difference in preferences between respondents and nonrespondents to a survey or a poll.

Ideally, we want to create an Unbiased Sample: The idea is for each object in the population to be equally likely to be chosen as part of the sample.

The sample should also be representative of the population (if 2/3 of the population is over 35 years old and 1/3 is 35 or younger, then the sample should have the same 2/3 and 1/3 split).


Chapter 6: Normal and Standard Normal Distribution

Outline:
1. Discrete and Continuous Variables
2. Normal Distribution and Normal Standard Distribution
3. The Empirical Rule
4. Class Examples

(Read pages 140 to 141 and 291 to 300 of the paper version of the textbook)

1. Discrete and Continuous Random Variables

Discrete: The random variable assumes a ________________ number of distinct values. e.g. ___________________

Continuous: The random variable is characterized by ___________________ values within any interval. e.g. ___________________

2. Normal Distribution and Standard Normal Distribution

Characteristics of a Normal Distribution: a)

b)


c)

Characteristics of a Standard Normal Distribution:

a)

b)

c)

d)

e)


3. The Empirical Rule:

The empirical rule 68–95–99.7
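The 68–95–99.7 figures can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library to compute the share of a normal distribution lying within 1, 2, and 3 standard deviations of the mean; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Probability that a normal value falls within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The printed shares round to roughly 68%, 95%, and 99.7%, which is where the rule gets its name.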

The Standard Normal Transformation: Any normally distributed random variable X with mean μ and standard deviation σ can be transformed into the standard normal random variable Z as:

$Z = \frac{X - \mu}{\sigma}$

Changing an x-value to a z-value: The Z-formula

$Z = \frac{x - \mu}{\sigma}$

Steps:
1) Take your x-value and subtract the mean (μ)
2) Divide it by the standard deviation (this gives you the z-value or z-score)
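The two steps above translate directly into a small function. The values x = 130, μ = 100, and σ = 15 are hypothetical and chosen only to keep the arithmetic easy to follow; `NormalDist.zscore` from the standard library is shown as a cross-check.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Step 1: subtract the mean. Step 2: divide by the standard deviation."""
    return (x - mu) / sigma

# Hypothetical values (assumptions, not taken from the exercises).
z = z_score(130, mu=100, sigma=15)
z_check = NormalDist(mu=100, sigma=15).zscore(130)

print(z, z_check)
```

A z-score of 2 means the value lies two standard deviations above the mean.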


The Standard Normal Transformation: (Example)


The Standard Normal Transformation: (Exercise #1)

a. Assume the ECO 202 examination scores were normally distributed with μ = 75 and σ = 10. What is the approximate z-score that corresponds to an exam score of 85?

$Z = \frac{x - \mu}{\sigma}$

b. What is the probability of randomly selecting an exam with a grade above 85?

(Exercise #2) a. ECO 202 examination scores were normally distributed with μ = 75 and σ = 10. What is the approximate z-score that corresponds to an exam score of 90.2?


Using the Empirical Rule:

(Exercise #3) Assume that the weight of Argentinean male students is normally distributed with a mean of about 95 kilograms and a standard deviation of approximately 11 kilograms. Without using a calculator, estimate the percentage of male students that meet the following conditions. Draw a sketch and shade the proper region for each problem.

a. Less than 84 kilograms


b. Between 73 and 117 kilograms

c. More than 128 kilograms
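After estimating the three answers by hand, they can be cross-checked with a short sketch. It uses the mean (95 kg) and standard deviation (11 kg) stated in Exercise #3 and assumes weights follow an exact normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

weights = NormalDist(mu=95, sigma=11)   # parameters stated in Exercise #3

# a. Less than 84 kg: 84 is one standard deviation below the mean.
p_below_84 = weights.cdf(84)

# b. Between 73 and 117 kg: two standard deviations either side of the mean.
p_73_to_117 = weights.cdf(117) - weights.cdf(73)

# c. More than 128 kg: three standard deviations above the mean.
p_above_128 = 1 - weights.cdf(128)

for label, p in [("a", p_below_84), ("b", p_73_to_117), ("c", p_above_128)]:
    print(f"{label}. {p:.2%}")
```

The exact values differ slightly from the empirical-rule estimates because the rule rounds 68.27%, 95.45%, and 99.73% to 68%, 95%, and 99.7%.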


Chapter 9: Hypothesis Testing

Outline:

The Null Hypothesis (H0) and the Alternative Hypothesis (HA)
1. Formulating the H0 and the HA hypothesis
2. One-Tailed versus Two-Tailed Tests
3. Type I and type II errors

Performing the Hypothesis Testing
4. Developing the null (H0) and alternative (HA) Hypothesis
5. Choose the level of significance (α)
6. Calculate the value of the test statistic and the P-value
7. Draw a conclusion and interpret the results

(Read pages 407 to 410 and 413 to 422 of the paper version of the textbook)

1. Formulating the H0 and the HA hypothesis

The null hypothesis, denoted by H0, is a tentative assumption about a population parameter. (What is currently believed to be true)

The alternative hypothesis, denoted by HA, is the opposite of what is stated in the null hypothesis. (It is the researcher’s hypothesis)

Example #1: A new teaching method is developed that is believed to be better than the current method.

• Null Hypothesis (H0): Students’ average study time per week at UK equals the national average of 14 hours a week.
_____________________________________________________________

• Alternative Hypothesis (HA): Students’ average study time per week at UK differs from the national average of 14 hours a week.
______________________________________________________________

2. One-Tailed and Two-Tailed Tests

The equality part of the hypotheses always appears in the null hypothesis (H0). In general, a hypothesis test about the value of a population mean μ must take one of the following three forms (where μ0 is the hypothesized value of the mean):

One-tailed (left-tail test)
H0: μ ≥ μ0
HA: μ < μ0

One-tailed (right-tail test)
H0: μ ≤ μ0
HA: μ > μ0

Two-tailed test
H0: μ = μ0
HA: μ ≠ μ0
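The three forms differ in which tail of the distribution counts as evidence against H0. The sketch below illustrates this under a simplifying assumption that is not stated in the text above: the test statistic is a z-value that follows the standard normal distribution when H0 is true. The function name `p_value` and the tail labels are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value for a z test statistic, assuming it is standard normal under H0.

    tail = "left"  matches HA: mu < mu0
    tail = "right" matches HA: mu > mu0
    tail = "two"   matches HA: mu != mu0
    """
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

# The same test statistic gives different P-values depending on the form of HA.
for tail in ("left", "right", "two"):
    print(f"z = 1.96, {tail}-tail: p = {p_value(1.96, tail):.4f}")
```

With z = 1.96, the two-tailed P-value is about 0.05 while the right-tailed P-value is about half that, which is why the form of HA has to be fixed before looking at the data.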

3. Type I and Type II Errors

Type I error: Committed when we reject H0 when H0 is actually true.

Type II error: Committed when we do not reject H0 when H0 is actually false.

Incorrect Decision: Reject H0 when H0 is true (Type I error). Incorrect Decision: Do not reject H0 when H0 is false (Type II error).

Performing a Hypothesis Test

Hypothesis testing enables us to determine whether the sample evidence is inconsistent with what is hypothesized under H0 (what we currently believe). First assume that H0 is true and then determine if sample evidence contr...

