Stats Exam #1 Study Guide on Chapter 1-3 PDF

Course Intro Statistical Analys
Institution Vanderbilt University
Pages 6


STATS CH. 1 - 3 Study Guide

Chapter 1: Intro to Stats
● Statistics = a set of mathematical procedures used to gather & organize information so researchers can see what happened in a study

Vocab:
● Population vs. Sample
  ○ Population = set of all individuals of interest in a study, the entire collection of events/individuals that you are interested in studying; usually not possible to examine in full
  ○ Sample = set of individuals selected from a population, usually intended to represent the population in a research study (e.g. college students at Vandy representing “college students”)
● Variable = a characteristic or condition that changes or has different values for different individuals
● Data = measurements or observations
● Parameter vs. Statistic
  ○ Parameter = a (typically numerical) value that describes a population, represented by a Greek letter
  ○ Statistic = a (typically numerical) value that describes a sample, usually derived from measurements of the individuals in the sample
● Descriptive vs. Inferential statistics
  ○ Descriptive statistics = statistical procedures used to summarize, organize, and simplify data
  ○ Inferential statistics = techniques that allow us to study samples and make generalizations about their populations
● Sampling error = the naturally occurring discrepancy between a sample statistic and the corresponding population parameter; a sample never represents its population perfectly, so the statistic will rarely equal the parameter (e.g. margin of error)
● Real limits = boundaries of the intervals for scores on a continuous scale, located halfway between adjacent scores
● Nominal vs. Ordinal scale
  ○ Nominal scale = consists of a set of categories that have different names; measurements on a nominal scale label and categorize observations but do not make any quantitative distinctions between observations
    ■ E.g. people classified by race, gender, or occupation
  ○ Ordinal scale = consists of a set of categories that are organized in an ordered sequence; ranks observations in terms of size or magnitude
    ■ Series of ranks (first, second, etc.) or sizes (soda cup size)
    ■ Can’t determine the size of the difference between two ranks

● Interval + ratio scales = both consist of a series of ordered categories that are all intervals of exactly the same size
  ○ Interval = zero is arbitrary, zero does not mean “none” of the variable, e.g. temperature, golf, or scores recorded as +2 or -2 above or below an average set to 0
  ○ Ratio = interval scale with the additional feature that a score of zero indicates none of the variable being measured (height, weight, time to complete a quiz)
● Descriptive research = involves measuring one or more separate variables for each individual w/ the intent of simply describing the individual variables
● Correlational method = two different variables are observed to determine whether there is a relationship between them
  ○ Demonstrating a relationship doesn’t provide an explanation for the relationship
● Construct & operational definition
  ○ Constructs = what we want to study (e.g. harsh parenting); attributes/characteristics that can’t be directly observed but are useful for explaining and describing behavior
  ○ Operational definition = a measurement procedure for external behavior that is used to define the construct in a specific, consistent way (e.g. critical tone, or IQ tests for the construct of “intelligence”)
● Discrete vs. continuous variables
  ○ Discrete = separate/indivisible categories; between two neighboring values no other value can ever be observed (e.g. # of children per family)
  ○ Continuous = an infinite number of possible values fall between any two observed values (e.g. time)
● Experimental method = one variable is manipulated while another is observed and measured; all other variables are controlled to establish a cause/effect relationship between the 2 variables
  ○ Independent = variable manipulated by the researcher
  ○ Dependent = variable that we expect to change due to the effect of the manipulation

“The MATH”
Variables - X, Y, N, n, ∑
● X = each score for a variable
● X & Y = if you measure two variables for each participant, use X and Y (e.g. X = height, Y = weight)
● N = number of scores in a population
● n = number of scores in a sample
● ∑ = “sum of”
● ∑X = sum of the scores
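
The ∑ notation above translates directly into code. Here is a minimal sketch, using made-up scores (the list `scores` is an assumption for illustration only), that also shows why squaring the sum is not the same as summing the squares:

```python
# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 4]

sum_x = sum(scores)                            # ∑X: add all the scores
squared_sum = sum(scores) ** 2                 # (∑X)^2: add first, then square
sum_of_squares = sum(x ** 2 for x in scores)   # ∑X^2: square each score, then add

print(sum_x, squared_sum, sum_of_squares)      # 10 100 30
```

The order of operations is the whole point: (∑X)^2 and ∑X^2 usually give very different numbers.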

Summation notation ( refer to image) ● 4 - the upper bound, where you stop counting (b) ● f(n) - indexed variable and index of summation (Xi) ● n = 1 - the lower bound, where you start counting (i = a) Variations of summations: 1. ∑X - just add all numbers from lower to upper bound 2. (∑X )^2 - squared sum of scores, e.g. (x1 + x2 + x3)^2 3. ∑X^2 - sum of squared scores, e.g. (x1^2 + x2^2 + x3^2) Chapter 2: Frequency Distributions ● Frequency distribution = organized tabulation showing the number of individuals located in each category on the scale of measurement ○ Puts disorganized set of scores and places in order from highest to lowest, grouping individuals with the same score ● HOW?? ○ Use X as the column heading for scores ƒ for column heading of frequencies ○ Write out all numbers in the range even if the frequency is zero & always list from highest down to lowest ● To compute ∑X… ○ Add “manually” ○ OR multiply X by frequency and add products ● Proportion & percentages: ○ P = ƒ / N, percentage is the same x100 ○ Proportion measures the fraction of the total group associated with each score ● Grouped frequency distribution - create “class intervals” ○ Use it if the number of categories/possible scores is very large and if they are grouped to make the table easier to understand ○ Info is LOST when categories are groups - individual scores can’t be measured and the wider the grouping interval, the more info is lost ○ RULES ■ On average around 10 intervals, but change depending on what the intervals showing ■ “Width” = “distance” of the intervals, e.g. 2, 5, 10, 20s ■ All intervals are the same width ■ Bottom score is a multiple of width ■ No gaps or overlaps

● Real limits for a continuous variable - the interval 50-59 has real limits of 49.5 and 59.5, a width of 10 points

GRAPHS
● Histograms - list scores (measurements) along the x-axis and then draw a bar above each X value; the height of the bar corresponds to the frequency, and adjacent bars touch side by side
  ○ Block graphs = “simplified histogram”, each bar is drawn as a stack of blocks, one block per individual
● Polygons - list measurement categories on the x-axis, center a dot above each score so that the vertical position of the dot corresponds to the frequency, then draw a continuous line connecting the dots
● Bar graph - for non-numerical scores (nominal and ordinal data); like a histogram but with spaces between the bars, which indicate discrete categories (nominal, no particular order) or categories whose width can’t be measured (ordinal)
● Shape of a frequency distribution
  ○ Central tendency - where the center of the distribution is located
  ○ Variability - degree to which the scores are spread out or clustered together
  ○ Symmetrical (mirror image) vs. skewed distribution (piles up on one side and tapers off toward the other)
    ■ Positively skewed (piles up on the left, tail to the right) vs. negatively skewed (piles up on the right, tail to the left)

Chapter 3: Mean, Median, & Mode

3-1 Overview
● Central tendency = the concept of an average or representative score; describes a distribution of scores by determining a single value that identifies the center of the distribution. The central value is the score that best represents all of the individuals in the distribution (most typical/most representative)
  ○ E.g. weather - advantageous to be able to describe a large set of data with a single, representative number
  ○ BUT, there’s no single/standard procedure to determine central tendency
● 3 different methods for measuring central tendency: mean, median, mode

3-2 The Mean
● Mean = “average”, add all the scores in the distribution and divide by the number of scores
  ○ Mean for a population = µ (divide by N)
  ○ Mean for a sample = M or x-bar (divide by n)
● Can also be defined as “dividing the total equally”, the amount each individual receives when the total is divided equally among the individuals
● “The mean as a balance point” - the mean is the center of the “seesaw”: the total distance of the scores above the mean equals the total distance of the scores below the mean
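
The balance-point idea can be checked numerically. A minimal sketch, assuming a small made-up sample (the scores below are not from the course material):

```python
# Hypothetical sample, chosen only for illustration.
scores = [1, 2, 6, 6, 10]

n = len(scores)
mean = sum(scores) / n                     # M = ∑X / n = 25 / 5 = 5.0

# Distance of each score from the mean (negative = below the mean).
deviations = [x - mean for x in scores]    # -4, -3, 1, 1, 5

total_below = sum(-d for d in deviations if d < 0)   # 4 + 3 = 7
total_above = sum(d for d in deviations if d > 0)    # 1 + 1 + 5 = 7

print(mean, total_below, total_above)      # 5.0 7.0 7.0
```

The two totals match, so the signed deviations sum to zero; this holds for any set of scores, not just this one.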



● “The weighted mean” - when you combine sets of scores and find the overall mean for the combined group
  ○ Need two values: the overall sum of the scores for the combined group and the total number of scores in the combined group
    ■ Overall mean = (∑X1 + ∑X2) / (n1 + n2)
  ○ When the sample sizes differ, the overall mean is not halfway between the original two sample means bc the larger sample makes a larger contribution to the total group and therefore carries more weight -> that is why it is called the “weighted” mean
● Changes to the mean!!
  ○ Introducing a new score or removing a score will usually change the mean
    ■ Exception is when the new score or removed score is exactly equal to the mean
  ○ If a constant value is added to every score in a distribution, the same constant will be added to the mean (vice versa with subtracting a constant)
  ○ If every score in a distribution is multiplied or divided by a constant value, the mean will change in the same way

3-3 The Median
● Median = the midpoint of the distribution
  ○ If the scores in a distribution are listed from smallest to largest, the median is the midpoint of the list
  ○ Point on the measurement scale below which 50% of the scores fall, i.e. the first point you reach that is greater than 50% of the scores
● When N (or n) is an even number, select the pair of scores in the middle and average them
● Precise median
  ○ Fraction = number needed to reach 50% / number in the interval
  ○ *only works for continuous variables - you can’t divide a discrete variable to find the precise median*

3-4 The Mode
● Mode = the most common observation among a group of scores, the score or category that has the greatest frequency
  ○ Casually used to refer to scores with relatively high frequencies
● Useful measure of central tendency bc it can be used to determine the typical or most frequent value for any scale of measurement, including a nominal scale
● The only measure of central tendency that always corresponds to an actual score in the data (the mean and median can fall between scores)
● MORE than one mode - bimodal & multimodal
  ○ A distribution with several equally high points is said to have no mode
  ○ Taller peak is the major mode, shorter one is the minor mode

3-5 Central Tendency and the Shape of the Distribution

Symmetrical Distributions
● The right-hand side of the graph is a mirror image of the left-hand side - the median is exactly at the center, and the mean is also exactly at the center
  ○ If a distribution is roughly symmetrical, the mean and the median will be close together at the center of the distribution

  ○ If it only has one mode, the mode will also be at the center
  ○ But a distribution can be symmetrical without the mode being at the center (e.g. a symmetrical bimodal distribution)

Skewed Distributions
● Strong tendency for the mean, median, and mode to be located at predictably different positions
  ○ ORDER from left to right: mode, median, mean if positively skewed; mean, median, mode if negatively skewed
  ○ Positively skewed: the “peak” is the mode, the median is typically to the right of the mode, and the mean is typically to the right of the median bc it is pulled toward the most extreme scores in the tail
  ○ Negatively skewed: same logic in reverse, the tail pulls the mean to the left of the median

3-6 Selecting a Measure of Central Tendency
● Whenever the scores are numerical values (interval or ratio scale) the mean is usually the best measure
  ○ The mean is closely related to variance/standard deviation
● When to use the median…
  ○ Extreme scores or skewed distributions - the mean isn’t the best representative bc it’s thrown off by scores that are very different from most, while the median isn’t easily affected by extreme scores
  ○ Undetermined values - when an individual has an unknown or undetermined score (e.g. a study of how long it takes children to complete puzzles, but one child never completes the puzzle), it’s impossible to compute the mean
  ○ Open-ended distributions - when there’s no upper limit or lower limit for a category it is “open ended”; impossible to compute a mean bc you can’t find ∑X
  ○ Ordinal scale - it is typically not appropriate to use the mean to describe ordinal data; the median is compatible bc ordinal measurements tell you direction but not distance, and the median depends only on direction (order) while the mean depends on distances
● When to use the mode…
  ○ Nominal scales
  ○ Discrete variables
  ○ Describing shape...
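
Several of the Chapter 3 claims above can be checked with Python's standard `statistics` module. This is a rough sketch, not a course-provided procedure: all the data sets are made up for illustration, and `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Weighted mean: combine two hypothetical samples of different sizes.
sample_1 = [4, 6, 8]          # n1 = 3, M1 = 6
sample_2 = [1, 2, 3, 2, 2]    # n2 = 5, M2 = 2
weighted = (sum(sample_1) + sum(sample_2)) / (len(sample_1) + len(sample_2))
halfway = (mean(sample_1) + mean(sample_2)) / 2
print(weighted, halfway)      # 3.5 4.0 (pulled toward the larger sample)

# Median with an even number of scores: average the middle pair.
print(median([1, 2, 3, 4]))   # 2.5

# Two scores tied for the highest frequency (bimodal).
print(multimode([2, 3, 3, 5, 5]))   # [3, 5]

# One extreme score moves the mean far more than the median.
base = [2, 3, 3, 4, 5]
with_extreme = base + [50]
print(mean(base), median(base))
print(mean(with_extreme), median(with_extreme))
```

With the extreme score added, the data are positively skewed and the three measures line up left to right as mode (3), median (3.5), mean (about 11.2), matching the order listed in 3-5.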

