Res-Econ TBL 5 Notes - Chebyshev's Theorem - Normal Distribution, Empirical Rule, Z-Scores, Measures

Title	Res-Econ TBL 5 Notes - Chebyshev's Theorem - Normal Distribution, Empirical Rule, Z-Scores, Measures
Course	Introductory Statistics for the Social Sciences
Institution	University of Massachusetts Amherst
Pages	3

Summary

Chebyshev's Theorem - Normal Distribution, Empirical Rule, Z-Scores, Measures of Association (Covariance, Correlation Coefficient) along with examples, formulas, and steps.
Professor: Wayne Roy Gayle...

Description

Resource Economics 212
October 6 2016
Textbook Notes TBL 5

Chebyshev’s Theorem
• For any population (or sample) with mean µ (x̄) and standard deviation σ (s), the percentage of observations that lie within k standard deviations of the mean, k > 1, must be:
  o At least 100[1 - 1/k^2]%
    ▪ EG: Assume k = 2 standard deviations, then 100[1 - 1/2^2] = 75%
    ▪ So, at least 75% will lie within 2 standard deviations of the mean.

Normal Distribution
• A bell-shaped curve that is symmetrical
  o Not all bell curves are normal
• Completely characterized by its mean and standard deviation
• Importance of the Normal Distribution
  o One of the most important distributions in natural and social sciences, because:
    ▪ There are many variables that closely follow the normal distribution
    ▪ Examples:
      o Height of people
      o Errors in measurement due to imperfect instruments/observers
      o Test Grades
    ▪ The averages of many samples independently drawn from the same distribution are distributed nearly normally for a large enough sample size. This is a crude statement of the Central Limit Theorem.

The Empirical Rule
• If the data are Normally distributed, then the Empirical Rule states that we expect
  o the interval [µ - kσ, µ + kσ] to contain a known percentage of the data.
    ▪ k = 1: 68.26% will lie within µ ± 1σ
    ▪ k = 2: 95.44% will lie within µ ± 2σ
    ▪ k = 3: 99.73% will lie within µ ± 3σ
• In rounded terms, if the data are Normally distributed, the Empirical Rule states that we expect:
  o +/- 1 standard deviation from the mean to contain 68% of the data
  o +/- 2 standard deviations from the mean to contain 95% of the data
  o +/- 3 standard deviations from the mean to contain 99.7% of the data
• Chebyshev’s Theorem vs. Empirical Rule
  o Chebyshev’s Theorem gives lower bound percentages for any distribution
    ▪ EG: 75% or more of the data falls within 2 standard deviations of the mean.
  o If the data is Normal, the Empirical Rule gives the approximate actual percentages, not just lower bounds
    ▪ EG: About 95% of the data falls within 2 standard deviations of the mean.

Z-Scores – Standardizing Data
• Definition
  o The number of standard deviations a value is from the mean.
  o Z = deviation from the mean (x – mean) / sample standard deviation (s)
  o Z-scores tell us the position of any observation relative to the mean.
  o Z-scores are unit free
• Inverse Z-Score
  o There is a one-to-one relationship between the z-score and data value.
  o Mathematically:
    ▪ $z = (x - \bar{x}) / s$
    ▪ $x = z \times s + \bar{x}$
• Important Z-Scores
  o Z = -3: Three standard deviations below the mean
  o Z = -2: Two standard deviations below the mean
  o Z = -1: One standard deviation below the mean
  o Z = 0: Exactly at the mean
  o Z = +1: One standard deviation above the mean
  o Z = +2: Two standard deviations above the mean
  o Z = +3: Three standard deviations above the mean

Visual Display – Bivariate Data
• A bivariate data set is a data set consisting of two variables
• Can be used to analyze the association between two variables visually or numerically.
• Examples:
  o Multiple Bar Chart
  o Multiple Line Graph
  o Multiple Scatter Chart

Measures of Association – Covariance
• Definition
  o Measure of linear association between two variables.
  o Measures how two variables change together.
  o Positive Covariance:
    ▪ Implies the variables are directly associated, or move together.
  o Negative Covariance:
    ▪ Implies the two variables are inversely associated, or move in opposite directions.
• Calculating the Covariance
  o Covariance of a Population = [Summation of (Yi – Y Mean)(Xi – X Mean)] / N
  o Covariance of a Sample = [Summation of (Yi – Y Mean)(Xi – X Mean)] / (n – 1)

Measures of Association – Correlation coefficient
• Definition
  o Measure of the degree of linear association between two variables.
  o Gives more information than the covariance:
    ▪ How closely related X and Y are.
    ▪ A unit free number with range [-1, 1]
    ▪ Value of -1 means perfect negative linear association.
    ▪ Value of +1 means perfect positive linear association.
• Formula
  o Population Correlation Coefficient = (Covariance of X and Y) / (Standard Deviation of X * Standard Deviation of Y)
    ▪ $\rho_{X,Y} = \sigma_{X,Y} / (\sigma_X \sigma_Y)$
  o Sample Correlation Coefficient = (Sample Covariance of X and Y) / (Sample Standard Deviation of X * Sample Standard Deviation of Y)
    ▪ $r_{X,Y} = s_{X,Y} / (s_X s_Y)$...
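As a rough check on the Chebyshev, Empirical Rule, and z-score parts of these notes, the sketch below computes the Chebyshev lower bound, the Normal coverage for k standard deviations, and a z-score with its inverse. The function names and the list of grades are made up for illustration and are not from the course material.

```python
import math
import statistics


def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is only informative for k > 1")
    return 100 * (1 - 1 / k**2)


def normal_coverage(k: float) -> float:
    """Percentage of a Normal distribution within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def inverse_z_score(z: float, mean: float, sd: float) -> float:
    """Recover the data value from its z-score."""
    return z * sd + mean


# Hypothetical sample of test grades (illustrative only).
grades = [62, 70, 74, 78, 80, 82, 85, 88, 91, 95]
mean = statistics.mean(grades)
sd = statistics.stdev(grades)  # sample standard deviation (divides by n - 1)

for k in (2, 3):
    print(f"k={k}: Chebyshev >= {chebyshev_lower_bound(k):.2f}%, "
          f"Normal ~ {normal_coverage(k):.2f}%")

z = z_score(91, mean, sd)
print(f"z-score of 91: {z:.3f}, back to data value: {inverse_z_score(z, mean, sd):.1f}")
```

Computed directly, the Normal coverage for k = 1 and k = 2 comes out near 68.27% and 95.45%. The 68.26% and 95.44% figures in the notes differ only in the last digit, which is what you get from doubling rounded z-table entries.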
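The covariance and correlation formulas in these notes can be sketched in a few lines. This is a minimal illustration using only the standard library. The function names, the `sample` flag, and the data are assumptions made for the example, not part of the course material.

```python
import math


def covariance(xs, ys, sample=True):
    """Sum of (Yi - Y mean)(Xi - X mean), divided by n - 1 (sample) or N (population)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    total = sum((y - y_mean) * (x - x_mean) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


def correlation(xs, ys, sample=True):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs, sample))  # variance is covariance with itself
    sd_y = math.sqrt(covariance(ys, ys, sample))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable has no spread")
    return covariance(xs, ys, sample) / (sd_x * sd_y)


# Hypothetical bivariate data (illustrative only): hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

print("sample covariance:", covariance(hours, scores))
print("sample correlation:", round(correlation(hours, scores), 4))
```

The covariance changes with the units of X and Y and with the choice of divisor. The correlation comes out the same for the sample and population versions, because the divisor cancels between the numerator and the standard deviations.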