Title | Res-Econ TBL 4 Notes - Measures of Central Tendency - Mean, Median, Mode, Outlier Measures of Dispersion |
---|---|
Course | Introductory Statistics for the Social Sciences |
Institution | University of Massachusetts Amherst |
Pages | 3 |
File Size | 99.2 KB |
Measures of Central Tendency - Mean, Median, Mode, Outlier
Measures of Dispersion - Variance, Standard Deviation, Range, Mean Absolute Deviation, Coefficient of Variation
Percentiles and Quartiles
Definitions, Steps, and Formulas provided.
Professor: Wayne Roy Gayle
...
Resource Economics 212
September 28, 2016
Lecture/Textbook Notes
TBL 4

Measures of Central Tendency

Central Tendency
o The middle or typical values of a distribution.
o Data on a graph can be skewed left, skewed right, or symmetric.

The Mean
o The sum of the data values divided by the number of data items.
    - For a population: called the mean, expected value, or average.
    - For a sample: called the sample mean or sample average.
    - Excel command: =AVERAGE(Data)
o $\bar{x} = (x_1 + x_2 + x_3 + \dots + x_n) / n$, where n is the total number of data items.

The Median
o The midpoint of a set of sorted data.
    - Separates the upper and lower halves of the sorted observations.
    - Denotes the 50th percentile.
o Calculating the median:
    - If the number of observations, n, is odd, the median is the middle observation in the sorted data.
    - If n is even, the median is the average of the middle two observations in the sorted data.
    - Excel command: =MEDIAN(Data)
    - Position of the median: (n + 1) / 2, the center position that divides the ordered data into two halves. When n is even, this position falls halfway between the two middle observations.

The Mode
o The most frequently occurring data value.
    - Its frequency and relative frequency are the highest among all values.
    - A data set may have multiple modes or no mode at all.
    - Excel command: =MODE(Data)

Outlier
o A value that is extremely high or extremely low relative to the rest of the data values.
o Effects:
    - Mean: affected, since every value, including the outlier, enters the calculation.
    - Median: largely unaffected, since it depends on the middle position of the sorted data, not on how extreme the end values are.
    - Mode: usually not affected, since an outlier is rarely the most frequent value.

Measures of Dispersion

Dispersion
o Measures the level of variability in the data.
o The spread of the data points about the center of the distribution.

Variance
o Average of the squared distances between the data values and their mean.
o Conceptualize:
    - Measure the difference between each observation and the mean.
    - Square each of the differences.
    - Sum the squared differences and divide by the appropriate factor (N for a population, n - 1 for a sample).
    - Excel function: =VAR(Data) (this returns the sample variance)
o Calculations
    - Population variance: $\sigma^2 = \sum_{i=1}^{N} (X_i - \mu)^2 / N$
    - Sample variance: $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)$

Standard Deviation
o The square root of the variance.
o Why use it? For unit matching: it is expressed in the same units as the data, whereas the variance is in squared units.
    - Excel function: =STDEV(Data)

Range
o Difference between the maximum and minimum values in a data set.
o Range = X(max) - X(min)
    - Very sensitive to outliers.
    - Excel command: =MAX(Data) - MIN(Data)

Mean Absolute Deviation (MAD)
o Measures the average of the absolute deviations from the center.
    - Absolute values must be used; otherwise the deviations around the mean would sum to zero.
    - Less sensitive to outliers than the range or the variance.
o Calculations
    - $\mathrm{MAD} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
    - Excel function: =AVEDEV(Data)

Coefficient of Variation
o Useful for comparing variables measured in different units or with different means.
o A unit-free measure of dispersion.
o Expressed as a percent of the mean.
o Only appropriate for nonnegative data. It is undefined if the mean is zero and not meaningful if the mean is negative.
o Calculation: $CV = 100 \times s / \bar{x}$
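
The dispersion formulas above can be checked by hand with a short script. The following is a minimal Python sketch, not part of the original notes: the function name `dispersion_summary` and the sample `scores` are hypothetical, and the coefficient of variation is reported as `None` when the mean is not positive, following the restriction stated above.

```python
import math


def dispersion_summary(data):
    """Compute the dispersion measures from the notes for a list of numbers."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations for the sample variance")

    mean = sum(data) / n
    # Sum of squared deviations from the mean, shared by both variance formulas.
    squared_devs = sum((x - mean) ** 2 for x in data)

    population_variance = squared_devs / n      # divide by N
    sample_variance = squared_devs / (n - 1)    # divide by n - 1
    sample_sd = math.sqrt(sample_variance)      # same units as the data

    data_range = max(data) - min(data)
    # Mean absolute deviation: average absolute distance from the mean.
    mad = sum(abs(x - mean) for x in data) / n
    # Coefficient of variation, in percent; skipped when the mean is not positive.
    cv = 100 * sample_sd / mean if mean > 0 else None

    return {
        "mean": mean,
        "population_variance": population_variance,
        "sample_variance": sample_variance,
        "sample_sd": sample_sd,
        "range": data_range,
        "mad": mad,
        "cv_percent": cv,
    }


# Hypothetical sample, chosen so the arithmetic is easy to verify by hand.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this sample the mean is 5, so the squared deviations sum to 32: the population variance is 32 / 8 = 4, the sample variance is 32 / 7, the range is 7, and the MAD is 12 / 8 = 1.5.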
Percentiles

Percentiles
o A value below which a certain percentage of the data fall.
    - 55th percentile: the value below which 55% of the data fall.
o Percentiles divide data into equal chunks.
    - Quartiles: 3 values that divide the data into 4 equal chunks.
    - Deciles: 9 values that divide the data into 10 equal chunks.
o Computing quartiles
    - Sort the raw data in ascending order (from low to high).
    - Q1 is at position 0.25(n+1).
    - Q2 is at position 0.50(n+1).
    - Q3 is at position 0.75(n+1).
    - Q1 leaves 25% of the values below it (it is the 25th percentile).
    - Q2 is the median (50th percentile).
    - Q3 leaves 75% of the values below it (75th percentile).

Five Number Summary
o 1. Minimum value
o 2. Q1
o 3. Q2
o 4. Q3
o 5. Maximum value

Interquartile Range
o The range within which the middle 50% of the data lie.
    - IQR = Q3 - Q1

Quartiles - Key Distribution Shapes
o Uniform
o Bell-shaped
o Right skewed
o Left skewed
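
The quartile positions, five number summary, and IQR described above can be sketched in Python as follows. The notes give only the positions 0.25(n+1), 0.50(n+1), and 0.75(n+1); what to do when a position is not a whole number is not stated, so this sketch assumes linear interpolation between the two neighboring observations. The function names and the sample `data` are hypothetical.

```python
def value_at_position(sorted_data, position):
    """Return the value at a 1-based, possibly fractional, position.

    Assumption (not from the notes): fractional positions are resolved by
    linear interpolation between the two neighboring sorted observations.
    """
    n = len(sorted_data)
    # Keep the position inside the data for very small samples.
    position = min(max(position, 1), n)
    lower = int(position)            # 1-based index of the lower neighbor
    fraction = position - lower
    if lower == n:
        return sorted_data[-1]
    low_value = sorted_data[lower - 1]
    high_value = sorted_data[lower]
    return low_value + fraction * (high_value - low_value)


def five_number_summary(data):
    """Minimum, Q1, Q2, Q3, maximum, plus the interquartile range."""
    ordered = sorted(data)           # step 1: sort from low to high
    n = len(ordered)
    q1 = value_at_position(ordered, 0.25 * (n + 1))
    q2 = value_at_position(ordered, 0.50 * (n + 1))
    q3 = value_at_position(ordered, 0.75 * (n + 1))
    return {
        "min": ordered[0],
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "max": ordered[-1],
        "iqr": q3 - q1,
    }


# Hypothetical sample with n = 11, so the positions are whole numbers: 3, 6, 9.
data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 1, 9]
result = five_number_summary(data)
print(result)
```

With n = 11 the sorted data are 1, 3, 5, 7, 8, 9, 12, 13, 14, 18, 21, so Q1, Q2, and Q3 are the 3rd, 6th, and 9th values (5, 9, and 14) and the IQR is 9. Spreadsheet and statistics packages use several different quartile conventions, so results for other sample sizes may differ slightly from this sketch.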
TBL 4 Class
o Worked on the project with the group and divided the tasks.
o Made an Excel template to calculate mean, median, standard deviation, and quartiles.
o IRAT was on TBL 4....
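
As a rough Python counterpart to the central tendency part of such a template, the sketch below computes the mean, median, and mode with the standard library and shows how a single outlier changes each one. It is an illustration only, not the group's template: the `wages` values and the function name `central_tendency` are hypothetical.

```python
from statistics import mean, median, multimode


def central_tendency(data):
    """Return the mean, median, and list of modes for a data set."""
    return {
        "mean": mean(data),
        "median": median(data),
        # multimode returns every value tied for the highest frequency.
        "modes": multimode(data),
    }


# Hypothetical hourly wages, then the same data with one extreme value added.
wages = [12, 15, 15, 18, 20]
wages_with_outlier = wages + [200]

before = central_tendency(wages)
after = central_tendency(wages_with_outlier)

print("without outlier:", before)
print("with outlier:   ", after)
```

In this example the mean jumps from 16 to about 46.7 once the outlier is added, the median moves only from 15 to 16.5, and the mode stays at 15, which matches the outlier effects listed in the notes.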