Covariance & Correlation PDF

Title Covariance & Correlation
Author Denise Curcio
Course Psychological Statistics I
Institution Long Island University
Pages 8

Summary

Notes from doctorate level Psychological Statistics course. Lecture on covariance correlations. Discusses relationship between variables and beginning of correlations. ...


Description

Covariance & Correlation Tuesday, March 19, 2019

3:13 PM

Measuring Relations Between Variables
• Quantifying a relationship between two variables means expressing it as a number
• We can see whether, as one variable deviates from its own mean, the other deviates in the same way, the opposite way, or stays the same
• This can be done by calculating the covariance
• If there is a similar (or opposite) pattern, we say the two variables "covary"
  ○ Going back to what variance is: how far scores deviate from the mean
  ○ When two variables deviate in a similar way, they covary

Review of Variance
• The variance is a measure of how much a group of scores deviates from the mean of a single variable
  ○ The variance is the average squared deviation from the mean
• Covariance is similar: it tells us by how much a group of scores on two variables differ from their respective means
  ○ Two sets of scores, each varying around its own mean
  ○ Does the pattern of how they vary match?

Covariance

• Look at the pattern
• People above the mean on ads watched are also above the mean on sugar packets bought; people below the mean on ads are also below the mean on packets
• Put this visual into a number
• Does the pattern of deviations for the 5 scores on X match the pattern of deviations for the 5 scores on Y?
• To quantify it, what do we do?
  ○ We can't simply add up the deviations: they sum to 0
  ○ Instead we multiply each person's two deviations together
  ○ Take their deviation on one variable and multiply it by their deviation on the other variable
  ○ Each person will have a cross-product deviation
  ○ Then we add up the cross-product deviations and divide by N - 1
  ○ This is the covariance
• Covariance = the average cross-product deviation

Covariance Steps
• Calculate the deviation (error) between the mean and each subject's score for the first variable (x)
• Calculate the deviation (error) between the mean and their score for the second variable (y)
• Multiply these deviation (error) values. These multiplied numbers are called cross-product deviations
• Add up these cross-product deviations
• Divide by N - 1. The result is the covariance
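The covariance steps can be sketched in a few lines of Python. The five pairs of scores below are hypothetical values chosen for illustration (the notes do not list the raw data), and `covariance` is just a helper name used here.

```python
from statistics import mean

# Hypothetical scores for 5 people (illustrative values, not taken from the notes)
ads = [5, 4, 4, 6, 8]          # x: ads watched
packets = [8, 9, 10, 13, 15]   # y: sugar packets bought

def covariance(x, y):
    """Sum of cross-product deviations divided by N - 1."""
    mx, my = mean(x), mean(y)
    # Each person's deviation on x times their deviation on y
    cross_products = [(xi - mx) * (yi - my) for xi, yi in zip(x, y)]
    return sum(cross_products) / (len(x) - 1)

cov_xy = covariance(ads, packets)
print(round(cov_xy, 2))
```

A positive result means people above the mean on one variable tend to be above the mean on the other.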

Equation for the Covariance

• A positive covariance tells us people's deviations are in the same direction on both variables
• A negative covariance tells us the patterns are reversed
  ○ E.g. how calm they are with tantrums
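The equation itself did not survive in these notes. A reconstruction consistent with the steps described earlier (sum the cross-product deviations, divide by N - 1) would be the following, where the symbols are assumed notation:

```latex
\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}
```

Here $\bar{x}$ and $\bar{y}$ are the two sample means and $N$ is the number of subjects.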

Example of How to Compute the Covariance

• If we were measuring height and changed the unit to inches, the covariance would be bigger
• It is completely dependent on your unit of measurement
• A bigger covariance doesn't mean the relationship is stronger
• It needs to be standardized so that you can tell what the number means
• A correlation is a standardized covariance

Limitations of Covariance
• It depends upon the units of measurement
  ○ E.g. the covariance of two variables measured in miles and dollars would be much smaller than if the variables were measured in feet and cents, even if the relation was exactly the same
• Solution: standardize it!
  ○ Divide by the standard deviations of both variables
• The standardized version of covariance is known as the correlation coefficient
  ○ It is unaffected by the units of measurement
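A small sketch of the unit problem, using made-up heights and scores (all values and helper names here are assumptions for illustration): converting feet to inches multiplies the covariance by 12 but leaves the correlation unchanged.

```python
from statistics import mean, stdev

def covariance(x, y):
    """Sum of cross-product deviations divided by N - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance divided by both sample standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))

# Hypothetical data: heights in feet and some second score
height_ft = [5.0, 5.5, 5.8, 6.0, 6.2]
score = [10, 12, 15, 14, 18]
height_in = [h * 12 for h in height_ft]  # same heights, now in inches

cov_ft = covariance(height_ft, score)
cov_in = covariance(height_in, score)
r_ft = correlation(height_ft, score)
r_in = correlation(height_in, score)

print(cov_ft, cov_in)  # covariance grows with the unit change
print(r_ft, r_in)      # correlation stays the same
```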

The Correlation Coefficient

• Divide the covariance by the SD of x and the SD of y, putting it on a scale of -1 to 1
• Pearson Product-Moment Correlation Coefficient
  ○ Product: computing the product of 2 z-scores to standardize it
  ○ Moment: another word for mean or average
• ρ (rho) is the correlation of a population
• r is the correlation of a sample
• We calculate r to estimate ρ

Example from before:
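As a rough check on the "product-moment" idea, the sketch below computes r two ways: covariance divided by both SDs, and the average product of z-scores (dividing by N - 1). The data are hypothetical illustration values, not the scores from the lecture.

```python
from statistics import mean, stdev

# Hypothetical scores for 5 people (illustrative values only)
ads = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]
n = len(ads)

def z_scores(values):
    """Standardize each score: (score - mean) / SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Route 1: covariance divided by the SD of x and the SD of y
mx, my = mean(ads), mean(packets)
cov_xy = sum((x - mx) * (y - my) for x, y in zip(ads, packets)) / (n - 1)
r_from_cov = cov_xy / (stdev(ads) * stdev(packets))

# Route 2: average product of the two sets of z-scores
r_from_z = sum(zx * zy for zx, zy in zip(z_scores(ads), z_scores(packets))) / (n - 1)

print(round(r_from_cov, 2), round(r_from_z, 2))
```

Both routes give the same number, which is why the correlation can be read as a standardized covariance.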

What does p = .06 mean?
• The assumption (the null hypothesis) is that ρ is 0
• When the null is true (ρ is 0), 6% of the time we will get a correlation of .87 or a more extreme result
• If we square .87 (r² = .76), that tells us the proportion of variability in one variable accounted for by the other
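One way to see what this p-value means is to simulate the null directly. The sketch below is an illustration, not how SPSS computes p: it assumes both variables are independent normal scores (so ρ = 0), draws many samples of 5 people, and counts how often |r| comes out at .87 or more. The function names, trial count, and seed are arbitrary choices.

```python
import random

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_p(r_observed, n, trials=20000, seed=1):
    """Share of null samples (rho = 0) with |r| at least as large as observed."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(trials):
        # Two unrelated variables, n people each
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        if abs(pearson_r(x, y)) >= abs(r_observed):
            extreme += 1
    return extreme / trials

p_sim = simulate_p(0.87, 5)
print(round(p_sim, 3))
```

Under these assumptions the simulated proportion should land close to the reported p of about .06, counting extreme results in both directions.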

• SPSS doesn't give a comprehensive report, just the r value
• We want to tell the reader how precise it is
• Creating a 95% CI would show how precise it is

Steps
1. Transform the correlation into a z-score
  ○ Because a correlation is bounded between -1 and 1, it doesn't follow the normal curve
    § It is standardized
    § We know z-scores follow a normal curve
  ○ Use Fisher's z-score transformation
2. Calculate the standard error of the z-score
  ○ We need to factor in the sample size
  ○ The correlation itself tells you nothing about the sample size
  ○ Remember, we are more confident when we have a bigger sample: the CI will be narrower
  ○ N goes into the formula for the standard error
3. Calculate the CI for that z-score
  ○ Multiply the standard error by 1.96 and go that far above and below the z-score
  ○ This provides your best guess and tells the reader how precise it is
4. Translate the upper and lower bounds of that z-score back into correlations
  ○ Use the same table to turn them back
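The four steps can be sketched as one short function. This is a minimal illustration under two assumptions: `math.atanh` and `math.tanh` stand in for the lookup table (they are the closed form of Fisher's transformation and its inverse), and the standard error is taken as 1/sqrt(N - 3), the usual formula for Fisher's z. The name `fisher_ci` is made up for this example.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via Fisher's z."""
    z_r = math.atanh(r)                # Step 1: correlation -> z-score
    se = 1 / math.sqrt(n - 3)          # Step 2: standard error uses N
    lo_z = z_r - z_crit * se           # Step 3: CI on the z scale
    hi_z = z_r + z_crit * se
    return math.tanh(lo_z), math.tanh(hi_z)  # Step 4: back to correlations

lower, upper = fisher_ci(0.87, 5)
print(round(lower, 2), round(upper, 2))
```

With r = .87 and N = 5 the interval is very wide, which is the point of reporting it: a large correlation from five people is not a precise estimate.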

Fisher's z-score transformation of Pearson's r

Let's look up the z-score
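The lookup table is not reproduced in these notes. For reference, the values in such a table come from Fisher's transformation, which can also be computed directly:

```latex
z_r = \frac{1}{2} \ln\!\left(\frac{1 + r}{1 - r}\right) = \operatorname{arctanh}(r)
```

For r = .87 this gives $\frac{1}{2}\ln(1.87 / 0.13) \approx 1.33$.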

Next Step
• In our example we had 5 subjects and we calculated the correlation between advertisements and sugar consumption to be .87. Using the table above, I get a z-score of 1.33.
• We then divide the z-score of the correlation (1.33), which is denoted $z_r$, by the standard error of $z_r$
• To compute the SE of $z_r$, use this formula: $SE_{z_r} = 1/\sqrt{N - 3}$
• With N = 5, $SE_{z_r}$ equals .71
• Using the formula $z = z_r / SE_{z_r}$, I get 1.33/.71 = 1.87

Is it statistically significant?
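The significance check can be reproduced without the table. This sketch assumes `math.atanh` in place of the lookup and 1/sqrt(N - 3) for the standard error; because it works with unrounded values, the resulting z differs slightly from the hand calculation.

```python
import math

r, n = 0.87, 5

z_r = math.atanh(r)              # Fisher z for the correlation
se_zr = 1 / math.sqrt(n - 3)     # standard error of z_r
z_stat = z_r / se_zr             # test statistic against rho = 0

# Two-tailed test at the .05 level uses the 1.96 cutoff
significant = abs(z_stat) > 1.96

print(round(z_r, 2), round(se_zr, 2), round(z_stat, 2), significant)
```

The statistic falls just short of 1.96, so the correlation is not significant at the .05 level with only 5 subjects.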

• There was a small sample size, so even though it's a large correlation, it isn't significant (the z falls below the 1.96 cutoff)
• The 95% CI extends 1.96 standard errors either side of the best estimate
• Add it to the best estimate giving us the u


95% Confidence Intervals Around a Correlation
• We calculated the z-score conversion of the correlation to be 1.33 and the $SE_{z_r}$ to be .71
• The z-score cutoff for a 95% confidence interval (z.95) is 1.96
• The confidence interval is therefore computed as:
  ○ Lower limit = 1.33 - (1.96)(0.71) = -0.06
  ○ Upper limit = 1.33 + (1.96)(0.71) = 2.72
• The r associated with a z-score of -0.06 is -0.06 and the r associated with a z-score of 2.72 is .99
• Therefore, we can be 95% confident that the true population correlation (ρ) lies between -0.06 and .99

*We are taking the correlation, making it into a z-score, making the CI a z-score, then turning it back to a correlation*...
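Turning a z-score back into a correlation can also be done without the table, using the inverse of Fisher's transformation (stated here as a standard identity, not something given in the notes):

```latex
r = \tanh(z) = \frac{e^{2z} - 1}{e^{2z} + 1}
```

For z = 2.72 this gives $r \approx .99$, and for z = -0.06 it gives $r \approx -0.06$, because the transformation barely changes values near 0.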

