Stat 13 lecture 4 PDF

Title Stat 13 lecture 4
Course Intro to Statistics
Institution University of California Davis




1/18/17





● Mode - most frequently occurring value(s)
  ○ Can have 0, 1, or > 1 mode
  ○ Data with 1 mode is called unimodal
  ○ Data with 2 modes is called bimodal (mixture)
  ○ Data with > 2 modes is called multimodal
● Midrange
  ○ (max value + min value) / 2
● Mean, median, midrange - measures of “center”
  ○ 1. Sample means drawn from the same population tend to vary less than sample medians and midranges
    ■ This makes sense because the mean is calculated using every value in the data, whereas
      ● Midrange uses only 2 values in the data (min and max)
      ● Median uses only 1 value (if n is odd) or 2 values (if n is even)
  ○ 2. If data has 1 or a few extreme values (very large or very small)
    ■ Can have a big effect on mean and midrange
    ■ The median changes little or not at all
  ○ Ex: income data in $1,000
    ■ Data: 20, 50, 70, 80, 90
    ■ Mean = 62

    ■ Midrange = (90 + 20) / 2 = 55
    ■ Median = 70
    ■ Replace 90 with 300 in data
      ● 20, 50, 70, 80, 300
      ● Mean = 104
      ● Median = 70
      ● Midrange = (300 + 20) / 2 = 160
      ● Mean and midrange change a lot; the median does not change
  ○ Sensitivity to extremes
    ■ Midrange = most sensitive
    ■ Mean = mildly sensitive
    ■ Median = least sensitive
    ■ When n is small, sensitivity rises
● Mean, median, histograms
  ○ 1. If histogram is symmetric around a point (necessarily the median)
    ■ Mean ≈ median
    ■ If all data values are perfectly symmetric
      ● Mean = median
  ○ 2. Histogram skewed to right
    ■ n is large
    ■ Mean > median
    ■ Large values pull up the mean
  ○ 3. Histogram skewed to left
    ■ Mean < median
    ■ Small values pull down the mean
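The income example can be checked numerically. The sketch below is only an illustration using Python's standard `statistics` module; the helper names `midrange` and `centers` are made up for this example and are not from the lecture.

```python
from statistics import mean, median


def midrange(values):
    """Average of the smallest and largest value (hypothetical helper)."""
    return (max(values) + min(values)) / 2


def centers(values):
    """Return the three measures of center discussed in the notes."""
    return {
        "mean": mean(values),
        "median": median(values),
        "midrange": midrange(values),
    }


incomes = [20, 50, 70, 80, 90]        # income data in $1,000
with_extreme = [20, 50, 70, 80, 300]  # 90 replaced by 300

before = centers(incomes)
after = centers(with_extreme)

# Show how far each measure moves when one extreme value is introduced.
for name in before:
    shift = after[name] - before[name]
    print(f"{name:9s} before={before[name]:6.1f} after={after[name]:6.1f} shift={shift:6.1f}")
```

The midrange moves the most (55 to 160), the mean moves less (62 to 104), and the median stays at 70. The modified data is also skewed to the right, and its mean (104) is larger than its median (70).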





● Measures of variation (or spread)
  ○ Measures of center (mean, median, midrange) tell you where the center of the data is; measures of variation tell you how spread out the data is
  ○ 1. Range = max value - min value
    ■ Very sensitive to extremes
    ■ Uses the same 2 values as the midrange (min and max)
  ○ 2. Variance
    ■ s^2 = Σ (xi - x̄)^2 / (n - 1), where x̄ is the sample mean
    ■ s^2 is an estimate of the population variance
    ■ Get a “better” estimate of the population variance by dividing by n - 1 rather than n
  ○ 3. Standard deviation
    ■ s = √(s^2)

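    ■ Formulas show that both s^2 and s increase as the (xi - x̄)^2 values increase, which means the variance is large when the data values x1, ..., xn are far away from the mean x̄
    ■ Ex: data 6, -2, 7, 9 (n = 4)
      ● Find standard deviation
      ● 1. Find mean: x̄ = (6 - 2 + 7 + 9) / 4 = 5
      ● 2. Find variance: s^2 = [(6 - 5)^2 + (-2 - 5)^2 + (7 - 5)^2 + (9 - 5)^2] / (4 - 1) = (1 + 49 + 4 + 16) / 3 = 70 / 3 ≈ 23.33
      ● 3. Find standard deviation: s = √23.33 ≈ 4.83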
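The three steps can be sketched in Python. This assumes the data is 6, -2, 7, 9 with n = 4, which is what the mean calculation (6 - 2 + 7 + 9) / 4 = 5 implies. The variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import stdev, variance

data = [6, -2, 7, 9]  # assumed data, consistent with a mean of 5
n = len(data)

# Step 1: sample mean
x_bar = sum(data) / n

# Step 2: sample variance, dividing the sum of squared deviations by n - 1
squared_devs = [(x - x_bar) ** 2 for x in data]
s2 = sum(squared_devs) / (n - 1)

# Step 3: sample standard deviation
s = sqrt(s2)

print(f"mean = {x_bar}, s^2 = {s2:.2f}, s = {s:.2f}")

# Cross-check against the standard library, which also divides by n - 1.
print(f"statistics.variance = {variance(data):.2f}, statistics.stdev = {stdev(data):.2f}")
```

The squared deviations are 1, 49, 4, and 16, so s^2 = 70 / 3 ≈ 23.33 and s ≈ 4.83.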

  ○ Alternate formula for calculating variance (don’t need to memorize)
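One standard shortcut form is sketched below. It is an assumption that this matches the version used in lecture, since equivalent forms exist. It follows from expanding the square in the definition of s^2 and using Σ xi = n·x̄:

```latex
\begin{align*}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 \\
    &= \frac{1}{n-1}\left(\sum_{i=1}^{n}x_i^2 \;-\; 2\bar{x}\sum_{i=1}^{n}x_i \;+\; n\bar{x}^2\right) \\
    &= \frac{1}{n-1}\left(\sum_{i=1}^{n}x_i^2 \;-\; n\bar{x}^2\right)
\end{align*}
```

For example, with data 6, -2, 7, 9 (x̄ = 5): Σ xi^2 = 36 + 4 + 49 + 81 = 170 and n·x̄^2 = 4 · 25 = 100, so s^2 = (170 - 100) / 3 = 70 / 3 ≈ 23.33.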





● Variance is approximately equal to the average squared distance between the xi values and the mean



● Properties of sample variance (s^2) and standard deviation (s)
  ○ 1. s^2 and s measure the spread of the data around the sample mean (x̄)
  ○ 2. s^2 ≥ 0
    ■ Look at the terms in the numerator, (xi - x̄)^2: those can’t be negative, and the denominator (n - 1) is > 0
    ■ s^2 isn’t defined for n = 1
  ○ 3. If s^2 = 0 (s = 0) then there’s no variation in the data, meaning all values in the data are the same number
  ○ 4. As s^2 and s increase, the average distance between the data values and the mean increases
    ■ Look at the terms in the numerator, (xi - x̄)^2
    ■ As |xi - x̄| increases, so does (xi - x̄)^2
  ○ 6. Units of s are the same as the units of the data (minutes, feet, pounds, etc.)
  ○ 7. Units of variance (s^2) are the square of the units of the original data
    ■ (ft)^2, (min)^2
  ○ 8. s^2 is an unbiased estimate of the population variance
    ■ “Unbiased” - does not tend to overestimate or underestimate the population variance
● x̄ (sample mean) is an estimate of the population mean (which we call μ, “mu”)
● s^2 is an estimate of the variance of the population (which we call σ^2, “sigma squared”)
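The claim that dividing by n - 1 gives an unbiased estimate can be illustrated with a small simulation. This is only a sketch under assumptions that are not from the lecture: the population is the six faces of a fair die, samples of size n = 5 are drawn with replacement, and the seed and repetition count are arbitrary.

```python
import random
from statistics import mean, pvariance, variance

random.seed(13)  # arbitrary seed so the run is repeatable

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a die
sigma2 = pvariance(population)   # population variance (divides by N)

n = 5         # sample size
reps = 10000  # number of simulated samples

est_n_minus_1 = []  # sample variances computed with n - 1 in the denominator
est_n = []          # same sums of squares, divided by n instead

for _ in range(reps):
    sample = random.choices(population, k=n)  # draw with replacement
    s2 = variance(sample)                     # divides by n - 1
    est_n_minus_1.append(s2)
    est_n.append(s2 * (n - 1) / n)            # rescale to the divide-by-n version

avg_unbiased = mean(est_n_minus_1)
avg_biased = mean(est_n)

print(f"population variance   = {sigma2:.3f}")
print(f"average of s^2 (n-1)  = {avg_unbiased:.3f}")
print(f"average with n        = {avg_biased:.3f}")
```

Over many samples the average of the n - 1 version should land close to the population variance, while the divide-by-n version is smaller by a factor of (n - 1) / n, so it tends to underestimate.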





● Population variance
  ○ σ^2 (sigma squared)
● Population standard deviation
  ○ σ (sigma)
● Population variance
  ○ σ^2 = Σ (xi - μ)^2 / N, for population values x1, ..., xN
  ○ N = number of members in population, n = number of members in sample



  ○ Formula isn’t the same as the one we use for the sample variance s^2
  ○ For population we divide by N
  ○ For sample we divide by n - 1
  ○ Parameters (μ, σ^2) are fixed numbers
    ■ Values don’t vary from sample to sample
  ○ Statistics (x̄, s^2)
    ■ Are random variables
    ■ Values vary from sample to sample
● How do we know how close x̄ and μ are?
● In most applied statistical methods we can’t calculate μ and σ^2 because we very rarely have data on the entire population
● μ and/or σ^2 may be
  ○ 1. Known from theoretical results
  ○ 2. Unknown - and we want to estimate them
  ○ 3. Hypothesized to be some specific value(s), and we want to use data to test if the hypothesized values are correct
  ○ ...

