P21500 Lecture Part 1 PDF

Title P21500 Lecture Part 1
Course Applied Statistics
Institution The City College of New York
Pages 14

Department of Psychology

Windows & SPSS Experimental and Statistical Procedures

Arthur D. Lynch, Ph.D. Department of Psychology NA 7210 Director, Psychology Computer Facility NA 6105 City College of New York Convent Ave. @ 136th Street New York, NY 10031 (212) 650-8460 email: [email protected]

Statistical Decision by Design Analysis and Data Level

Instructions: To determine the appropriate descriptive statistic and its corresponding inferential test, find the type of statistical design for the independent variable, then select the data level of the dependent variable. The independent variable has two levels (i.e., experimental conditions).

Between-Subjects Design
  Data level is at least   Central tendency & variability   Inferential test statistic
  Nominal (a)              Mode & variation ratio           χ2 for mode
  Ordinal                  Median & SIQR                    t for independent sample mean (b)
  Interval                 Mean & standard deviation        t for independent sample mean
  Ratio                    Mean & standard deviation        t for independent sample mean

Within-Subjects Design
  Data level is at least   Central tendency & variability   Inferential test statistic
  Nominal (a)              Mode & variation ratio           Cochran Q
  Ordinal                  Median & SIQR                    t for dependent sample mean (c)
  Interval                 Mean & standard deviation        t for dependent sample mean
  Ratio                    Mean & standard deviation        t for dependent sample mean

Relational
  Data level is at least   Correlation                      Inferential test statistic
  Nominal (a)              Phi (Φ)                          χ2 for Φ
  Ordinal                  Spearman rank r                  t for Spearman r
  Interval                 Pearson r                        t for Pearson r
  Ratio                    Pearson r                        t for Pearson r

Shape
  Data level is at least   Skew & kurtosis                  Inferential test statistic
  Nominal (a)              NA                               NA
  Ordinal                  g1 & g2 (d)                      Z for g1 & g2
  Interval                 g1 & g2                          Z for g1 & g2
  Ratio                    g1 & g2                          Z for g1 & g2

(a) Binomial classification with two categories
(b) Nonparametric inferential test: Mann-Whitney U test if the scale is not an equal-appearing interval scale
(c) Nonparametric inferential test: Wilcoxon T test if the scale is not an equal-appearing interval scale
(d) Assumes an equal-appearing interval scale

6/1/2013


Measurement Scales

Nominal Scale: Scale in which objects are classified simply as being different or the same.
1. Examples: diagnostics, license plates, Social Security numbers.
2. Formal properties: Equivalence, i.e., the members of any one subclass must be equivalent in the property being scaled. The relation is therefore (a) reflexive, i.e., x = x for all values of x; (b) symmetrical, i.e., if x = y, then y = x; (c) transitive, i.e., if x = y and y = z, then x = z.
3. Admissible operations: unique up to a one-to-one transformation (i.e., symbols for subclasses may be interchanged in a consistent manner).
4. Permissible descriptive statistic(s): mode & variation ratio *

Ordinal Scale: Scale in which objects are classified as being different or the same (nominal property) and as having an order relation among themselves.
1. Examples: opinions (agree through disagree), military ranks, seniority, house numbers.
2. Formal properties: All the properties of a nominal scale, plus the relations greater than and less than, which are (a) irreflexive, i.e., it is not true for any x that x > x; (b) asymmetrical, i.e., if x > y, then it is not true that y > x; (c) transitive, i.e., if x > y and y > z, then x > z.
3. Admissible operations: unique up to a monotonic transformation (i.e., symbols for subclasses may be exchanged as long as the order of the symbols maintains the same ranking).
4. Permissible descriptive statistic(s): mode & variation ratio, median & SIQR *

Interval Scale: Scale in which objects are classified as being different or the same, objects are ranked, and the distance between any two points on the scale is of known size.
1. Examples: Fahrenheit and centigrade temperature, IQ scale, many "formal" Likert-type psychological test scales. Note that the differences between the scale points are isomorphic to the structure of arithmetic, e.g., F = (9/5) C + 32 or C = (F − 32) × 5/9:
   Fahrenheit: 32, 50, 86, 212; (86 − 50) / (50 − 32) = 36 / 18 = 2.0
   Centigrade: 0, 10, 30, 100; (30 − 10) / (10 − 0) = 20 / 10 = 2.0
2. Formal properties: All the properties of nominal and ordinal scales, and (a) the ratio of any two intervals is independent of the unit of measurement; (b) the ratio of any two intervals is independent of the zero point (the zero point is arbitrary).
3. Admissible operations: unique up to a linear transformation (i.e., f(x) = ax + b).
4. Permissible descriptive statistic(s): mode & variation ratio, median & SIQR, mean & standard deviation *

Ratio Scale: Scale in which objects are classified as being different or the same, objects are ranked with known intervals, and the scale has a true zero point.
1. Examples: weight, time, latency and speed scores, frequency of behavior(s).
2. Formal properties: All the properties of nominal, ordinal, and interval scales, and (a) the scale has a true zero point; (b) the ratio of any two points is independent of the unit of measurement.
3. Admissible operations: unique up to multiplication by a positive constant, where only the unit of measurement is arbitrary (i.e., the ratio between any two scale points is preserved when scale values are multiplied by a positive constant).
4. Permissible descriptive statistic(s): mode & variation ratio, median & SIQR, mean & standard deviation *

* The most powerful and usually preferred descriptive statistics for each scale (set in bold in the original handout) are the last pair listed for that scale.
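The Fahrenheit and centigrade arithmetic above can be checked with a short sketch. The function names are illustrative only, not part of the handout. It shows that a linear transformation preserves the ratio of two intervals but not the ratio of two scale points, which is why temperature in these units is interval rather than ratio data.

```python
def c_to_f(c):
    """Linear transformation f(x) = a*x + b with a = 9/5 and b = 32."""
    return 9 / 5 * c + 32


def interval_ratio(p, q, r):
    """Ratio of the interval (r - q) to the interval (q - p)."""
    return (r - q) / (q - p)


celsius = [0, 10, 30]
fahrenheit = [c_to_f(c) for c in celsius]

# Ratios of intervals agree across the two units.
ratio_c = interval_ratio(*celsius)
ratio_f = interval_ratio(*fahrenheit)

# Ratios of scale points do not, because the zero point is arbitrary.
point_ratio_c = celsius[2] / celsius[1]
point_ratio_f = fahrenheit[2] / fahrenheit[1]

print(ratio_c, ratio_f)
print(point_ratio_c, point_ratio_f)
```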


Nonparametric Analysis

Nominal Data

Mode: In a frequency distribution, the mode is the score with the maximum frequency.
Variation ratio: The proportion of scores not at the mode.
Mode (multiple): In a frequency distribution with multiple modes, the mode is the list of the scores with the maximum frequency.
Variation ratio (multiple modes): The proportion of scores not at the modes.
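As a minimal sketch of these definitions, the function below returns every mode and the variation ratio for a list of nominal scores. The function name and the example category labels are hypothetical.

```python
from collections import Counter


def modes_and_variation_ratio(scores):
    """Return (list of modes, proportion of scores not at the mode(s))."""
    counts = Counter(scores)
    max_freq = max(counts.values())
    modes = sorted(value for value, freq in counts.items() if freq == max_freq)
    scores_at_modes = max_freq * len(modes)
    variation_ratio = 1 - scores_at_modes / len(scores)
    return modes, variation_ratio


# Hypothetical diagnostic categories, single mode.
single = modes_and_variation_ratio(["A", "B", "A", "C", "A", "B"])

# Hypothetical data with two modes.
multiple = modes_and_variation_ratio(["A", "A", "B", "B", "C"])

print(single)
print(multiple)
```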

Ordinal Data

Quartiles: A distribution may be divided at 4 quartile points, Q1, Q2, Q3, and Q4, where:
Q1 is that point on a scale below which 25% of the cases fall and above which 75% of the cases fall
Q2 is that point on a scale below which 50% of the cases fall and above which 50% of the cases fall
Q3 is that point on a scale below which 75% of the cases fall and above which 25% of the cases fall
Q4 is that point on a scale below which 100% of the cases fall and above which 0% of the cases fall

Median = Q2
InterQuartile Range (IQR) = (Q3 − Q1)
Semi-InterQuartile Range (SIQR) = (Q3 − Q1) / 2
Outlier (extreme data point): a score below Q1 − k (Q3 − Q1) or above Q3 + k (Q3 − Q1) for some constant k
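The quartile-based statistics above can be sketched with Python's standard library. Two choices here are assumptions rather than anything the handout specifies: the "inclusive" interpolation method of `statistics.quantiles` (other packages, SPSS included, may use a different quartile convention and return slightly different values), and k = 1.5 as the outlier constant, which is a common convention. The scores are hypothetical.

```python
import statistics


def quartile_summary(scores, k=1.5):
    """Median, IQR, SIQR and outliers; k = 1.5 is an assumed convention."""
    q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return {
        "median": q2,
        "iqr": iqr,
        "siqr": iqr / 2,
        "outliers": [x for x in scores if x < low_fence or x > high_fence],
    }


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical ordinal scores
summary = quartile_summary(scores)
print(summary)
```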


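Parametric Analysis

Interval and Ratio Data

Moments about the mean (Mn), each summed over the N scores:

1. $$M_1 = \frac{\sum (X - \bar{X})^{1}}{N}$$

2. $$M_2 = \frac{\sum (X - \bar{X})^{2}}{N}$$

3. $$M_3 = \frac{\sum (X - \bar{X})^{3}}{N}$$

4. $$M_4 = \frac{\sum (X - \bar{X})^{4}}{N}$$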
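A minimal sketch of the four moments about the mean follows. It assumes each moment is the sum of powered deviations divided by N (not N − 1), and the function name and scores are hypothetical.

```python
def central_moment(scores, k):
    """k-th moment about the mean, assuming division by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** k for x in scores) / n


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical interval-level scores
m1, m2, m3, m4 = (central_moment(scores, k) for k in (1, 2, 3, 4))
print(m1, m2, m3, m4)
```

The first moment about the mean is always zero, and under this convention the second moment is the variance computed with N in the denominator.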


Skew

Measure of skew (g1):

$$g_1 = \frac{M_3}{M_2 \sqrt{M_2}}$$

If g1 < 0, then skew is negative
If g1 = 0, then skew is zero
If g1 > 0, then skew is positive

Test of significant skew (g1):

Standard error of skew:

$$\sigma_{g_1} = \sqrt{\frac{6}{N}}$$
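The skew measure and its standard error can be sketched as below. Two points are assumptions: the moments divide by N, and the Z statistic is formed as g1 divided by its standard error. The handout names a Z test for g1 without spelling out the formula. The scores are hypothetical.

```python
import math


def central_moment(scores, k):
    """k-th moment about the mean, assuming division by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** k for x in scores) / n


def skew_g1(scores):
    """g1 = M3 / (M2 * sqrt(M2))."""
    m2 = central_moment(scores, 2)
    m3 = central_moment(scores, 3)
    return m3 / (m2 * math.sqrt(m2))


def se_skew(n):
    """Standard error of skew, sqrt(6 / N)."""
    return math.sqrt(6 / n)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
g1 = skew_g1(scores)
z_g1 = g1 / se_skew(len(scores))  # assumed form of the Z test
print(g1, z_g1)
```

A positive g1, as here, indicates a longer tail toward the high scores.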



Kurtosis

Measure of kurtosis (g2):

$$g_2 = \frac{M_4}{M_2^{2}} - 3$$

If g2 < 0, then kurtosis is platykurtic
If g2 = 0, then kurtosis is mesokurtic
If g2 > 0, then kurtosis is leptokurtic

Test of significant kurtosis (g2):

Standard error of kurtosis:

$$\sigma_{g_2} = \sqrt{\frac{24}{N}}$$
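A matching sketch for kurtosis, under the same assumptions as before: moments divide by N, and the Z statistic is taken to be g2 divided by its standard error. The labels follow the handout's sign rule. Scores and function names are hypothetical.

```python
import math


def central_moment(scores, k):
    """k-th moment about the mean, assuming division by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** k for x in scores) / n


def kurtosis_g2(scores):
    """g2 = M4 / M2**2 - 3."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


def describe_kurtosis(g2):
    """Label the sign of g2 as in the handout."""
    if g2 < 0:
        return "platykurtic"
    if g2 > 0:
        return "leptokurtic"
    return "mesokurtic"


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
g2 = kurtosis_g2(scores)
se_g2 = math.sqrt(24 / len(scores))  # standard error of kurtosis
z_g2 = g2 / se_g2                    # assumed form of the Z test
print(g2, describe_kurtosis(g2), z_g2)
```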


Correlation Coefficients

Pearson Product Moment Correlation (r): -1.0 ≤ r ≤ +1.0
Spearman Rank Order Correlation (rs): -1.0 ≤ rs ≤ +1.0
Phi Correlation Coefficient (Φ): -1.0 ≤ Φ ≤ +1.0

Spearman Rank Order Correlation (rs) - If both variables (X, Y) are ordinal and both are expressed as ranks, then the Pearson r computed on the ranks equals rs, i.e., r = rs.

Phi Coefficient - If both variables (X, Y) are nominal and dichotomous, then the Pearson r computed on the two-category codes equals Φ, i.e., r = Φ.
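The two identities, r = rs for ranked data and r = Φ for dichotomous data, can be checked numerically. The sketch below is illustrative only. The rank-difference formula for rs (valid when there are no tied ranks) and the 2 × 2 cell formula for Φ are standard textbook forms that the handout does not state, and all data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation from deviation scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Ranked data without ties: rs = 1 - 6 * sum(d^2) / (N * (N^2 - 1)).
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
n = len(rank_x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
r_on_ranks = pearson_r(rank_x, rank_y)

# Dichotomous data coded 0/1: phi from the cells of the 2 x 2 table.
x = [1, 1, 1, 0, 0, 0, 1, 0]
y = [1, 1, 0, 0, 0, 1, 1, 0]
a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
r_on_codes = pearson_r(x, y)

print(rs, r_on_ranks)
print(phi, r_on_codes)
```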

Scatter plots for various correlations:

Positive correlation: Low → Low, High → High
Negative correlation: Low → High, High → Low
Zero correlation: Low → Low, Low → High, High → High, High → Low

Common Misconception: Correlation and causality

A correlation between two variables that are merely observed (i.e., not under experimental control) does not usually permit one to infer a causal relationship between them: “Correlation does not imply causation.”


Common Misconception: Correlation and linearity The Pearson correlation indicates the strength of a linear relationship between two variables, but its value alone may not be sufficient to evaluate this relationship since there may be a nonlinear relationship between the variables.

Figure: Four sets of data with the same correlation of 0.816

The figure shows scatterplots of Anscombe's quartet, a set of four different pairs of variables originally created by Francis Anscombe. The four Xi, Yi pairs have
• the same X mean (9.0) and standard deviation (3.32), and the same Y mean (7.5) and standard deviation (2.03),
• the same correlation between X and Y (0.816), and the same regression line (y = 3 + 0.5x).

However, as can be seen on the plots, the distribution of the variables is very different.
1. The top left seems to be distributed normally, and corresponds to what one would expect when considering two variables that are correlated and follow the assumption of normality.
2. The top right is not distributed normally; while an obvious relationship between the two variables can be observed, it is not linear but curvilinear.
3. In the bottom left the linear relationship is perfect, except for one outlier, which exerts enough influence to lower the correlation coefficient from 1 to 0.816.
4. Finally, the bottom right shows another example in which one outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.
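The summary statistics quoted for Anscombe's quartet can be reproduced from the published data values. The sketch below uses only the standard library, and the helper names are illustrative. It prints nearly identical means, standard deviations, correlations, and regression lines for four data sets whose scatterplots look entirely different.

```python
import math
import statistics

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def describe(x, y):
    """Means, sample standard deviations, Pearson r and least-squares line."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    return {
        "mean_x": mx, "sd_x": statistics.stdev(x),
        "mean_y": my, "sd_y": statistics.stdev(y),
        "r": sxy / math.sqrt(sxx * syy),
        "slope": slope, "intercept": my - slope * mx,
    }


results = {name: describe(x, y) for name, (x, y) in quartet.items()}
for name, stats in results.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Because the numbers agree so closely, only a plot of each data set reveals the curvilinear pattern and the outliers described above.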

Statistical Choices for Bivariate Correlation by Data Level

Instructions: To determine the appropriate bivariate correlation as the descriptive statistic and its corresponding inferential test, select the data level of the X variable (Scale X) and the data level of the Y variable (Scale Y).

  Scale X       Scale Y       Correlation        Inferential test statistic
  Nominal (a)   Nominal (a)   Phi                Chi-square
  Ordinal       Nominal (a)   NA                 NA
  Ordinal       Ordinal       Spearman rank r    t-test of Spearman r
  Interval      Nominal (a)   rpb                t-test of rpb
  Interval      Ordinal       Spearman rank r    t-test of Spearman r
  Interval      Interval      Pearson r          t-test of Pearson r
  Ratio         Nominal (a)   rpb                t-test of rpb
  Ratio         Ordinal       Spearman rank r    t-test of Spearman r
  Ratio         Interval      Pearson r          t-test of Pearson r
  Ratio         Ratio         Pearson r          t-test of Pearson r

(a) Binomial classification with two categories

rpb is the point-biserial correlation. The table lists only combinations in which Scale Y is at or below the level of Scale X.


Transformations

Monotonic and Asymmetric Transformations

One-to-one transformations are ones in which values for subclasses may be interchanged in a consistent manner.

Monotonic transformations are ones in which values for subclasses may be exchanged as long as the order of the subclasses maintains the same ranking.

Asymmetric transformations are ones in which values for subclasses may be exchanged wherein the order of the subclasses does not maintain the same ranking or order, e.g., subclasses are combined.
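A small sketch may make the three kinds of transformation concrete. The codes, labels, and helper function are hypothetical. Squaring positive codes stands in for a monotonic transformation, a relabelling dictionary for a one-to-one transformation, and merging the top two subclasses for an asymmetric one.

```python
def order_of(values):
    """Positions of the cases when sorted by value (the ranking of the cases)."""
    return sorted(range(len(values)), key=lambda i: values[i])


codes = [3, 1, 4, 2]  # hypothetical ordinal codes for four cases

# Monotonic: squaring positive codes changes the values but keeps the ranking.
monotonic = [c ** 2 for c in codes]

# One-to-one: each subclass gets a new symbol, consistently; distinctness is kept.
relabel = {1: "D", 2: "A", 3: "C", 4: "B"}
one_to_one = [relabel[c] for c in codes]

# Asymmetric: subclasses 3 and 4 are combined, so their order is lost.
combined = [min(c, 3) for c in codes]

print(order_of(codes) == order_of(monotonic))
print(len(set(one_to_one)) == len(set(codes)))
print(len(set(combined)) < len(set(codes)))
```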


Distributions & Models

Sample: Given some simple experiment and the sample space of the elementary events, the set of N outcomes of the separate trials is a sample.

Random Sample: A sample drawn such that each and every distinct sample of the same size N has exactly the same probability of being selected.

Sample Distribution: An empirical frequency distribution, which describes the relative frequency associated with each of the various measurement classes observed for a given set of data based on a randomly selected subset of the population.

Population Distribution: A theoretical frequency distribution, which describes the relative frequency associated with each of the various measurement classes into which an entire set of possible observations (mutually exclusive and exhaustive events) can be placed.

Independent and Dependent Variable: In an experimental study, the independent variable is usually the variable being changed or manipulated, while the dependent variable is the variable that is observed and measured as a result of the variation in the independent variable, i.e., the dependent variable is a function of the independent variable. In quasi-experimental or nonexperimental studies, variables are not manipulated but are observed and measured, often in a natural setting, and the relationship among the variables is explored.

Parametric Statistical Models: Assumptions are made about the parameters of the population from which the research sample is drawn. Scores are assumed to have at least an interval scale of measurement.

Nonparametric Statistical Models: No (or very few) assumptions are made about population parameters. Depending upon the model, all scales can be used.


Point Estimation

Facts: Facts or summary statistics reflect knowledge of a sample or a population.
1. Statistic: a fact or summary statistic about a sample (e.g., mean of sample).
2. Parameter: a fact or summary statistic about a population (e.g., mean of population).

Point Estimation: Estimating a parameter (θ) with a single value of a statistic (G), e.g., estimating the mean of the population (µ) from the mean of the sample (X̄).
1. Unbiased estimate: An estimate of a population parameter is unbiased if the mean of the distribution of the statistic (G) is equal to the value of the parameter (θ) which it estimates: E(G) = θ.
2. Biased estimate: An estimate of a population parameter is biased if the mean of the distribution of the statistic (G) is not equal to the value of the parameter (θ) which it estimates: E(G) ≠ θ.
3. Consistent estimate: An estimate of a population parameter is said to be consistent if the value of the statistic (G) approaches the population value (θ) more closely as the sample size increases.
4. Efficient estimate: An estimate of a population parameter is said to be relatively more efficient as the standard deviation of the statistic (standard error) decreases.

Interval Estimation: Estimating a parameter (θ) with a lower (Gl) and upper (Gu) value of a statistic (G), an interval that is inferred with varying levels of confidence to include the value of the parameter. Thus, the wider the interval between the lower (Gl) and upper (Gu) values, the more likely it is that the interval includes the value of the parameter being estimated.

  Random Sample Statistic    Estimated Population Parameter
  X̄                          μ
  S2                         σ2
  S                          σ
  r                          ρ
  mdn                        Mdn
  siqr                       SIQR
  mode                       Mode
  vr                         VR
  rs                         ρs
  Φ                          Φpop
  g1                         G1
  g2                         G2
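The difference between an unbiased and a biased estimate can be illustrated by enumerating every possible sample instead of simulating. The sketch below assumes a hypothetical population of six scores and samples of size 2 drawn with replacement. It compares the sample variance computed with N − 1 in the denominator against the version with N, along with the sample mean.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def variance(values, ddof):
    """Sum of squared deviations divided by (N - ddof)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - ddof)


population = [1, 2, 3, 4, 5, 6]  # hypothetical population of scores
n = 2                            # sample size

mu = mean(population)
sigma2 = variance(population, ddof=0)

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

expected_mean = mean([mean(s) for s in samples])
expected_var_unbiased = mean([variance(s, ddof=1) for s in samples])
expected_var_biased = mean([variance(s, ddof=0) for s in samples])

print(mu, expected_mean)
print(sigma2, expected_var_unbiased, expected_var_biased)
```

Here E(G) equals the parameter for the sample mean and for the N − 1 variance, while the N-denominator variance falls short by the factor (n − 1)/n.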


Appendix: Windows Standard Key Combos

Windows clipboard & clipbook with viewer
• Ctrl+Z - Undo last action
• Ctrl+X - Cut selected text to clipboard
• Ctrl+C - Copy selected text to clipboard
• Ctrl+V - Paste

Selecting Text
• Ctrl+A - Select All
• Ctrl+Shift+left arrow key - select text 1 word left
• Ctrl+Shift+right arrow key - select text 1 word right
• Ctrl+Shift+up arrow key - select text 1 line up
• Ctrl+Shift+down arrow key - select text 1 line down
• Ctrl+Shift+Home key - select text from current position to top of document
• Ctrl+Shift+End key - select text from current position to end of document

Find, Find Next & Replace
• Ctrl+F - Find
• F3 - Find Next
• Ctrl+H - Find & Replace


References

Gravetter, F. J. & Wallnau, L. B. "Essentials of Statistics for the Behavioral Sciences" (6th Edition). New York: Wadsworth/Thomson Learning. ISBN 0495383945.

Semi-interquartile range graphics: http://cnx.org/content/col10522/1.35

Skew graphics: http://www.ehow.com/about_4793229_what-is-skew.html (courtesy of eHow.com, http://www.eHow.com)

Kurtosis graphics: Faherty, Vincent E. "Compassionate Statistics: Applied Quantitative Analysis for Social Services: With Exercises and Instructions in SPSS". SAGE, 2007. ISBN 1412939828, 9781412939829.

Graphics with correlation: http://allpsych.com/researchmethods/images/correlations.gif

Common misconceptions in correlation: http://en.wikipedia.org/wiki/Correlation

Graphics with normal curve: Standard deviation diagram, based on an original graph by Jeremy Kemp, 2005-02-09 [http://pbeirne.com/Programming/gaussian.ps]. http://en.wikipedia.org/wiki/Normal_distribution

Seeing Statistics by Gary H. McClelland, Probabilities for the Normal Distribution: http://psych.colorado.edu/~mcclella/java/normal/handleNormal.html

Graphics for sampling distribution of means: Howell, David C. "Fundamental Statistics for the Behavioral Sciences" (6th Edition). Wadsworth: Cengage Learning. ISBN-10: 0495099007, ISBN-13: 9780495099000.

Graphics hotspot for sampling distribution of means: Online Statistics: An Interactive Multimedia Course of Study (http://onlinestatbook.com/), http://onlinestatbook.com/simulations/CLT/clt.html
