Title | Shapes and distribution |
---|---|
Course | Psychology Statistics and Practical |
Institution | University of Kent |
Pages | 17 |
SP300: Shapes of Distributions
Comparing measures of central tendency
Image: three examples of central tendency. The first is symmetrical, while the others show a clustering of scores at either end of the x axis. See slide 12 of last week’s lecture for more detail.
UK gross annual income
• Median = £25,948
Image: an example of a skewed distribution. The majority of incomes cluster around the lower end of the scale, meaning that the median is quite low relative to the full width of the scale (£0-250k+).
Measures of dispersion: range and interquartile range
• Range is very sensitive to outliers and skewed data
• Interquartile range is less sensitive because we only look at the middle 50% of data
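To illustrate the two bullets on range and interquartile range, here is a minimal Python sketch. The scores are hypothetical, and the quartiles use the "inclusive" method of the standard library's `statistics.quantiles`, which is only one of several conventions; SPSS or a hand calculation may give slightly different quartiles.

```python
import statistics

def value_range(scores):
    """Range: highest score minus lowest score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """IQR: distance between the 25th and 75th percentiles."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3 - q1

# Hypothetical scores; the second list adds one extreme outlier.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10]
with_outlier = scores + [100]

print("Range:", value_range(scores), "->", value_range(with_outlier))
print("IQR:  ", interquartile_range(scores), "->", interquartile_range(with_outlier))
```

With these numbers the single outlier moves the range from 8 to 98, while the interquartile range barely changes (4 to 4.5).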
Image: a symmetrical (normal) distribution divided into quarters. The lowest and highest 25% of scores are excluded when considering the interquartile range. We can also call these percentiles, with the interquartile range running from the 25th to the 75th percentile.
Variance
• The average squared amount by which each case in the sample differs from the mean
• Widely used in psychological statistics
• X̄ = mean
• N = number of cases
• ∑ = sum of what follows
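The formula itself did not survive in this copy of the slide. Written with the symbols from the legend above, the standard definitional form would be the one below. Dividing by N rather than N − 1 is an assumption here (it describes the sample itself rather than estimating a population value).

```latex
\[
\text{variance} = \frac{\sum (X - \bar{X})^{2}}{N}
\]
```

Read from the inside out: subtract the mean from each score, square each difference, add the squares up, and divide by the number of cases.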
Computational formula for variance
• An alternative formula for working out variance by hand
• Gives the same result as the formula on the previous slide
• A useful shortcut, but the regular formula expresses more intuitively what variance is
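Neither formula is preserved in this copy, so the sketch below uses the standard textbook versions: the definitional form, sum of (X − mean)² divided by N, and the usual hand-calculation shortcut, (∑X² − (∑X)²/N) divided by N. That these match the slides exactly, and that the slides divide by N rather than N − 1, are assumptions. The scores are hypothetical.

```python
def variance_definitional(scores):
    """Mean squared deviation from the mean: sum((X - mean)^2) / N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

def variance_computational(scores):
    """Hand-calculation shortcut: (sum(X^2) - (sum(X))^2 / N) / N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return (sum_x_squared - sum_x ** 2 / n) / n

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

print(variance_definitional(scores))   # 32 / 8
print(variance_computational(scores))  # (232 - 1600 / 8) / 8
```

Both functions return 4.0 for these scores. The shortcut only needs the running totals ∑X and ∑X², which is why it is quicker by hand.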
Shapes of distributions
• One thing you should always check is the way in which data on a particular variable is distributed
• Many statistical tests assume a ‘normal distribution’
• From last week: the shape of the distribution affects the mean, median and mode
Major features of a distribution:
• Central tendency
• Variation
• Skewness – whether the data is central, mostly to the left, or mostly to the right
• Kurtosis – whether the distribution is especially flat or steep
Histograms
• Histograms can vary in resolution
Image: The top image shows a steep curve, with large differences between column size. The lower image shows a smooth curve, with small differences between column size.
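As a rough illustration of histogram "resolution", the sketch below counts the same hypothetical scores into wide bins and then into narrow bins and prints a text histogram for each. The function names and the data are made up for this example, and whole-number scores are assumed.

```python
from collections import Counter

def histogram(scores, bin_width):
    """Count how many scores fall into each bin of the given width."""
    counts = Counter((score // bin_width) * bin_width for score in scores)
    return dict(sorted(counts.items()))

def print_histogram(scores, bin_width):
    """Print one row of '#' marks per non-empty bin."""
    for start, count in histogram(scores, bin_width).items():
        print(f"{start:>3}-{start + bin_width - 1:<3} {'#' * count}")

# Hypothetical whole-number scores.
scores = [41, 45, 47, 52, 53, 55, 56, 58, 58, 59,
          61, 62, 63, 64, 66, 67, 71, 74, 78, 83]

print_histogram(scores, bin_width=20)  # low resolution: three wide columns
print()
print_histogram(scores, bin_width=5)   # higher resolution: nine narrow columns
```

The data are identical in both plots; only the bin width changes how much of the shape is visible.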
Frequency curve
Image: A histogram with a frequency curve. The frequency curve is a smoother illustration of the histogram which can make it easier to see the patterns in the data
Normal distribution
Image: A normal distribution curve. A normal distribution is symmetrical, so that data points cluster evenly about the mean, median, and mode, which will be the same number
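The claim that the mean, median and mode coincide can be checked on simulated data. This is a sketch under assumptions: the variable (mean 100, standard deviation 15), the sample size and the seed are all made up, and because the scores are continuous they are grouped into 5-point bands before taking the mode.

```python
import random
import statistics

random.seed(300)  # fixed seed so the sketch is repeatable

# Hypothetical variable: 10,000 scores drawn from a normal distribution
# with mean 100 and standard deviation 15.
scores = [random.gauss(100, 15) for _ in range(10_000)]

mean = statistics.mean(scores)
median = statistics.median(scores)
# Group scores to the nearest 5 so that a "most common value" is meaningful.
mode = statistics.mode(round(score / 5) * 5 for score in scores)

print(f"mean   = {mean:.2f}")
print(f"median = {median:.2f}")
print(f"mode   = {mode} (nearest 5)")
```

In a sample the three values will not be exactly equal, but they should sit very close together, which is the sense in which an approximately normal distribution is good enough.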
Normal distribution (2)
• It used to be thought that most characteristics in the natural world fit this pattern
• Most statistical techniques have this as a basic assumption
• But! You don’t need a 100% normal distribution – an approximation is okay
• Don’t forget: for a normal distribution the mean, median and mode are equal
Normal distribution: example
Image: overlapping frequency curves demonstrating height in men and women. Note that because they overlap, there are women with ‘male typical’ height, and men with ‘female typical’ height.
Skewness
• How ‘lop-sided’ or asymmetrical the frequency curve is
Image: three bar charts showing negative skew (where scores cluster at the high end of the scale), a normal distribution, and a positive skew (where scores cluster at the low end of the scale)
Skewness (2)
• Negative skew:
  • More scores on the left of the mode than on the right
  • Mean and median are smaller than the mode
Skewness (3)
• Positive skew:
  • More scores on the right of the mode than on the left
  • Mean and median are larger than the mode
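The ordering of mean, median and mode under each kind of skew can be checked on a small example. The slides do not give a formula for skewness, so the moment-based coefficient below (mean cubed deviation divided by the standard deviation cubed) is an assumption, chosen as one common measure. The data are hypothetical.

```python
import statistics

def skewness(scores):
    """Moment-based skewness: mean cubed deviation divided by SD cubed."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return sum((x - mean) ** 3 for x in scores) / n / sd ** 3

positive = [1, 2, 2, 2, 3, 3, 4, 5, 7, 10]  # long tail to the right
negative = [11 - x for x in positive]        # mirror image: long tail to the left

for label, scores in (("positive", positive), ("negative", negative)):
    print(label,
          "mode =", statistics.mode(scores),
          "median =", statistics.median(scores),
          "mean =", round(statistics.mean(scores), 2),
          "skewness =", round(skewness(scores), 2))
```

For the positively skewed list the mode (2) is below the median (3), which is below the mean (3.9). The mirror-image list reverses the order and the sign of the coefficient.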
Kurtosis
• Extent to which the frequency curve is excessively steep or flat
A flat (uniform) distribution
A steep distribution
Kurtosis (2)
• Leptokurtic: A steep curve
• Mesokurtic: A normal curve
• Platykurtic: A flat curve
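The slides describe kurtosis only qualitatively. One common way to put a number on it, assumed here rather than taken from the lecture, is the fourth-moment coefficient minus 3 ("excess kurtosis"), so that a normal curve scores about 0, a steep curve scores above 0 and a flat curve below 0. The two data sets are hypothetical.

```python
def excess_kurtosis(scores):
    """Fourth-moment kurtosis minus 3, so a normal curve scores about 0."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / n
    fourth_moment = sum((x - mean) ** 4 for x in scores) / n
    return fourth_moment / variance ** 2 - 3

flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # every score equally common
steep = [1, 5, 5, 5, 5, 5, 5, 5, 5, 9]    # most scores piled on one value

print("flat :", round(excess_kurtosis(flat), 2))   # negative: platykurtic
print("steep:", round(excess_kurtosis(steep), 2))  # positive: leptokurtic
```

Statistical packages often apply small-sample corrections, so their kurtosis values may differ slightly from this plain version.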
Bimodal distribution
• Bimodal = 2 peaks
• Multimodal = many peaks
• What variables might have a bimodal distribution?
Bimodal distribution (2)
• Remember: the median and mean might not be at either peak
https://www.etsy.com/uk/listing/71739287/collection-of-10-distribution-plushies
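A small worked example of the reminder that the mean and median may sit at neither peak of a bimodal distribution. The scores are hypothetical, built to have one peak at 2 and another at 8.

```python
import statistics

# Hypothetical scores with two peaks, at 2 and at 8.
scores = [1, 2, 2, 2, 2, 3, 4, 6, 7, 8, 8, 8, 8, 9]

modes = statistics.multimode(scores)  # every value tied for most frequent
mean = statistics.mean(scores)
median = statistics.median(scores)

print("modes  =", modes)
print("mean   =", mean)
print("median =", median)
```

Here the modes are 2 and 8, while the mean and median are both 5, a score that nobody in the sample actually obtained.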
Cumulative frequency distribution
• Each point shows the number (or percentage) of cases that have that value or less
• When shown as percentages, should always reach 100% at the furthest right
• Useful to look at, e.g. number of people falling below a target point
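A cumulative frequency distribution can be built by keeping a running total of the frequency counts. The sketch below uses hypothetical test scores and expresses the running total as a percentage of all cases; the function name is made up for this example.

```python
from collections import Counter
from itertools import accumulate

def cumulative_percentages(scores):
    """Map each distinct score to the % of cases at or below it."""
    counts = sorted(Counter(scores).items())
    running_totals = accumulate(count for _value, count in counts)
    n = len(scores)
    return {value: 100 * total / n
            for (value, _count), total in zip(counts, running_totals)}

scores = [3, 4, 4, 5, 5, 5, 6, 6, 7, 9]  # hypothetical test scores
table = cumulative_percentages(scores)

for value, percent in table.items():
    print(f"score <= {value}: {percent:.0f}%")
```

The last entry is 100% by construction. To see how many people fell below a target of 5, read off the entry for the next score down: 30% scored 4 or less.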
Percentiles
• Percentile = % of cases that fall below a score in the distribution
• E.g. in a test you might rank people by percentile, i.e. how many did worse
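Following the definition above (the percentage of cases that fall below a score), a percentile rank can be computed by simple counting. The class scores and the function name are hypothetical; note that some textbooks and packages count cases at or below the score instead, which gives slightly different values.

```python
def percentile_rank(scores, score):
    """Percentage of cases scoring strictly below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical test scores for a class of ten.
class_scores = [35, 42, 48, 51, 55, 58, 62, 67, 73, 88]

print(percentile_rank(class_scores, 67))  # 7 of the 10 scores are lower
```

A student scoring 67 here is at the 70th percentile: 70% of the class did worse.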
Distribution shapes and data analysis...