| Title | STAT1103 Stats Notes |
|---|---|
| Course | Introduction to Psychological Design and Statistics |
| Institution | Macquarie University |
Statistics Notes
Lecture 3 – Hypothesis Testing

The hypothesis has two components:
- Null hypothesis (H0): no difference/change in results, i.e., nothing has happened
- Alternative hypothesis (H1): a difference/change in results, i.e., something has happened

Valid bases for forming hypotheses:
- Intuition – based on opinion, faith, belief, or feelings (common sense)
- Authority – knowledge about behaviour from an expert or trustworthy source
- Rational induction – based on the combination of facts
- Empirical science – knowledge about behaviour tested and confirmed via the scientific method
  o The only valid method for testing hypotheses
Categorical vs Numerical Methods:
- Empirical research methods gather measurable data
- Categorical (qualitative) – the data gathered are descriptive (nominal or ordinal)
- Numerical (quantitative) – the data gathered are numeric and analysed using quantitative analytic methods (interval or ratio)
- Methods can be mixed – complementary approaches for complementary information
Categorical (nominal or ordinal) data:

Numeric summaries:
- The count or number (frequency) in each category
- The proportion or percentage in each category
- Mode

For one variable: tabulate variable
For multiple variables: tabulate variable1 variable2

Graphical summaries:
- Plot the number or percentage/proportion in each category using a bar chart
  o x-axis = categories
  o y-axis = count/percentage/proportion
- Frequency on the y-axis: graph bar (count), over(variablename)
- Percentage on the y-axis: graph bar (percent), over(variablename)
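A minimal sketch of these commands, assuming a dataset in memory with a categorical variable named group (the variable name is hypothetical):

* one-way frequency table (counts and percentages)
tabulate group
* bar chart with counts on the y-axis
graph bar (count), over(group)
* bar chart with percentages on the y-axis
graph bar (percent), over(group)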
Lecture 4 – Measurement and the Central Limit Theorem

Measurements – the data taken on subjects in a study according to the variables of interest.

Numerical and graphical summaries
One categorical variable:

Numeric:
- Frequency tables showing the percentage or proportion in each category of a nominal or ordinal variable
- tabulate variablename

Graphical:
- Bar charts / pie charts showing the frequency / percentage of observations in each category
- graph bar (count), over(variablename)
- graph bar (percent), over(variablename)
- graph pie, over(variablename)
One numerical variable:

Numeric:
- Calculation of summary or descriptive statistics
  o Mean or median (for the centre)
  o SD or IQR (for spread)
  o Variance and range
- tabstat variablename, statistics(mean sd range median iqr)

Graphical:
- Histogram: histogram variablename
- Boxplot: graph hbox variablename
  o Median = central line
  o IQR = edges of the box
  o Range = lowest and highest points
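A minimal sketch, assuming a numeric variable named age (hypothetical name):

* descriptive statistics: centre, spread, and range
tabstat age, statistics(mean sd range median iqr)
* distribution shape
histogram age
* horizontal boxplot showing the median, IQR, and range
graph hbox age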
Bivariate Summaries

Most studies are interested in the relationship between two (or more) variables.
- Outcome of interest (from the hypothesis) = dependent variable (DV)
- Independent variable (IV) is used to predict the outcome
Two categorical variables:

Numerical:
- tabulate variable1 variable2, row
- Cross-tabulation of one categorical variable by another (contingency table)

Graphical:
- graph bar (percent), over(variable1) over(variable2)
- graph bar (count), over(variable1) over(variable2)
- Plots the count / percentage of a categorical variable within each category of another categorical variable
Two numerical variables:

Numeric:
- Pearson correlation – a numerical summary that describes the strength and direction of the linear relationship between two variables
- r = sample correlation
- ρ = population correlation

Correlation formula:

$r = \frac{\mathrm{cov}(x, y)}{s_x s_y}$

- Combines the variability within each variable ($s_x$, $s_y$) with the variability between the two variables (the covariance)
- correlate variable1 variable2

Graphical:
- Scatterplot
  o Shows the relation between two variables
  o Each point is an x–y pair from the same individual
  o Can show a positive relationship, a negative relationship, no relationship, or a non-linear relationship
- scatter variable1 variable2
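A minimal sketch, assuming two numeric variables study_hours and exam_score (hypothetical names), with the DV listed first so it lands on the y-axis:

* scatterplot: y-axis = DV, x-axis = IV
scatter exam_score study_hours
* Pearson correlation coefficient
correlate exam_score study_hours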
One categorical and one numerical variable:

Numerical:
- Calculate descriptive statistics for each category
- Compare means, medians, and variability (SD, variance, IQR, range)
- by categorical_variablename, sort: tabstat numeric_variablename, statistics(mean sd range median iqr)

Graphical:
- Comparative boxplots
- Compare descriptive statistics, plus gaps and outliers
- graph box numeric_variablename, over(categorical_variablename)
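A minimal sketch, assuming a numeric variable score and a categorical variable group (hypothetical names):

* descriptive statistics within each category
by group, sort: tabstat score, statistics(mean sd range median iqr)
* side-by-side boxplots for comparing the groups
graph box score, over(group)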
Correlation

Pearson's correlation coefficient:
- Ranges from -1.00 to +1.00
- The further from 0, the stronger the correlation:
  o 0 to ±0.10: very weak to no relationship
  o ±0.10 to ±0.30: weak relationship
  o ±0.30 to ±0.50: moderate relationship
  o ±0.50 to ±1.00: strong relationship
- Correlation is both a test statistic and a measure of effect size

WHEN TO CALCULATE CORRELATION:
- Check aspects of the scatterplot:
  o Monotonic (does the trend keep in one direction?)
  o Linear?
  o Direction of association (positive or negative)
  o Gaps?
  o Outliers?
- If the data checks out:
  o Comment on the direction of association
  o Calculate and comment on the correlation strength
- y-axis = DV, x-axis = IV
Statistics and Parameters:

| | Sample statistic | Population parameter |
|---|---|---|
| Mean | x̄ (M) | μ |
| Median | x̃ (Mdn) | μ̃ |
| Std dev | s (SD) | σ |
| Variance | s² | σ² |
Summary

Numerical summaries (and Stata code):

| DATA | categorical | numerical | one variable alone |
|---|---|---|---|
| categorical | contingency tables (count, %) | mean, median, SD, IQR, variance by group | frequency tables (count, %) |
| numerical | mean, median, SD, IQR, variance by group | correlation | mean, median, SD, IQR, variance |

Graphical summaries:

| DATA | categorical | numerical | one variable alone |
|---|---|---|---|
| categorical | clustered bar charts | comparative box plots | bar charts or pie charts |
| numerical | comparative box plots | scatter plots | histograms |
Sampling distributions
- Hypothesis testing assumes the sampling distribution of the mean is normal
- With a large enough sample size, sample means are normally distributed regardless of the population's shape
- This is the central limit theorem
- Variability (called the standard error, the SD of the sample mean) decreases as the sample size increases:

$SE = \frac{s}{\sqrt{n}}$

- Used when making assumptions that relate to normality
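A quick worked instance of the formula (the numbers are illustrative, not from the notes):

$SE = \frac{12}{\sqrt{36}} = 2, \qquad SE = \frac{12}{\sqrt{144}} = 1$

so quadrupling the sample size halves the standard error.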
Lecture 5 – One-Sample Tests

The process:
1. State the null and alternative hypotheses based on known or postulated beliefs about a population
2. Choose a significance level for the test (α, i.e., alpha). Usually 5% (α = 0.05)
3. Choose an appropriate test statistic and use the sample to check the assumptions for that test
4. Calculate the test statistic using the sample data
5. Calculate the probability of having a test statistic as extreme or more extreme than the one you found (the p-value) if the null hypothesis is true
6. Make a verdict about the null hypothesis: if the p-value ≤ α we reject H0; if the p-value > α we fail to reject H0
7. Write a conclusion about your research question
Test statistics are the standardised difference between the null value and the sample statistic. We standardise the difference by dividing it by the standard error.

Notation:
- M = sample mean
- μ = population mean
- σ = population standard deviation
- n = sample size
- χ² = chi-squared
- Σ = sum

z-score:

$z = \frac{\text{observed} - \text{expected}}{SE}$, where the standard error is $SE = \frac{\sigma}{\sqrt{n}}$
One-sample z-test for a mean:

USED WHEN:
- Data is numeric
- Sample is drawn from a normal distribution (graph the sample data if not given)
- Observations are independent
- σ must be given

All assumptions must be met.

$z = \frac{M - \mu}{\sigma / \sqrt{n}}$

ztest variablename == mu0, sd(sigma)

A z close to 0 gives a large probability; a z far from 0 gives a small probability. A small p-value means the result is very unlikely under H0, so we favour H1.
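A minimal sketch, assuming a variable iq, a hypothesised population mean of 100, and a known population SD of 15 (all values hypothetical):

* one-sample z-test of H0: mu = 100 with sigma = 15
ztest iq == 100, sd(15)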
One-sample t-test for a mean:

USED WHEN:
- We don't know σ
  o We estimate it using the sample standard deviation (s)
  o When σ is unknown, use the t-test statistic with v = n − 1 degrees of freedom (df)
- Scores are numeric
- Scores are approximately normally distributed, not too skewed
- Observations are independent

$t = \frac{M - \mu}{s / \sqrt{n}}$

ttest variablename == mu0

We use a one-sample t-test for a mean when we DO NOT know σ, the population standard deviation.

APA format: A one-sample t-test was conducted to determine … . Results indicate that … (M, SD), t(df) = x, p = … (no 0 before the decimal point in the p-value).
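A minimal sketch, assuming a variable stress_score and a hypothesised population mean of 20 (hypothetical values):

* one-sample t-test of H0: mu = 20, population SD unknown
ttest stress_score == 20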
Chi-squared goodness-of-fit test

USED WHEN:
- Scores are categorical
  o We want to ask questions about the number or proportion in each category of that categorical variable
    § E.g., are 50% of participants female?
    § Are the proportions of each category equal?
    § Does the sample match relevant known statistics?
  o Asking whether the proportions of observations across categories differ from some known (expected) proportions
- Observations are independent
- Expected frequency in each category is at least 5

In chi-squared goodness-of-fit tests we hypothesise about the proportions in each category:

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$

where O_i = observed value for category i and E_i = expected value for category i. Square the difference between each observed and expected value, divide by the expected value, and sum over the categories.

To find the p-value:
display chi2tail(df, test_statistic)

Stata has no inbuilt command for performing the whole test, so do this instead:
findit csgof (then click the link to install the package)
then type:
csgof categorical_variablename, expperc(perc1, perc2, etc.)
replacing perc1, perc2, etc. with the expected (hypothesised) percentage in each category.

APA format: A chi-squared goodness-of-fit test was conducted to determine whether … . There was/was no evidence that … , χ²(df, N) = … , p = … .
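A minimal sketch, assuming a three-category variable colour with hypothesised percentages of 25, 25, and 50, and a chi-squared statistic of 4.2 on 2 df (all values hypothetical; csgof syntax as given above, installed via findit csgof):

* goodness-of-fit test against the expected percentages
csgof colour, expperc(25, 25, 50)
* upper-tail p-value for chi-squared = 4.2 with df = 2
display chi2tail(2, 4.2)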
Lecture 6 – Non-experimental Data
- Arise when the researcher simply observes the subject or object under investigation; no intervention is applied
- Looking for associations / relationships between a dependent and independent variable(s)
- If an association is found, we can only describe it; we cannot know for sure that there is a causative effect
Shapiro-Wilk test for normality

USED WHEN:
- Testing whether the population from which the sample is drawn is normally distributed
- Numeric variables only
- Used to see if the assumption of normality is met, IN CONJUNCTION with a graph and/or numeric descriptive statistics
- H0 = normally distributed
- H1 = not normally distributed
- We don't want a significant result

In Stata:
Graph form: histogram variablename
Table form: swilk variablename
The p-value is in the Prob>z column.
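A minimal sketch, assuming a numeric variable reaction_time (hypothetical name):

* visual check of the distribution's shape
histogram reaction_time
* Shapiro-Wilk test; read the p-value from the Prob>z column
swilk reaction_time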
Two-sample t-tests of means

USED WHEN:
- Comparing two independent groups
- DV is numeric
- DV is normally distributed in both groups (use histograms and Shapiro-Wilk)
- Approximately equal variance between the two groups (check with Levene's test)
- Observations are independent (within and between groups)

Hypotheses: H0: μ1 = μ2 versus H1: μ1 ≠ μ2
Use the α = 5% significance level.
The test statistic is:

$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

where x̄ and ȳ = sample means of the DV in groups 1 and 2, n1 and n2 = sample sizes in groups 1 and 2, and s_p = pooled standard deviation:

$s_p = \sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}$

where s₁² and s₂² = sample variances in groups 1 and 2.

Step 1: check the variability of each group
by categorical_variablename, sort: summarize numeric_variablename

Step 2: check the normality of each group
histogram numeric_variablename, by(categorical_variablename) freq

Step 3: test equality of variances (Levene's test)
robvar numeric_variablename, by(categorical_variablename)

Simple method using Stata:
ttest numeric_variablename, by(categorical_variablename)

If the p-value ≤ 0.05, we reject the null hypothesis.
APA format: A two-sample t-test for the difference in means between … and … indicated that … .
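A minimal sketch of the full workflow, assuming a numeric DV score and a two-group variable group (hypothetical names):

* descriptive statistics by group
by group, sort: summarize score
* normality within each group
histogram score, by(group) freq
* Levene's test for equality of variances
robvar score, by(group)
* two-sample t-test
ttest score, by(group)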
Significance levels: Why 5%? Used to minimise the chance of a Type I error, which happens when:
- We reject the null hypothesis when we shouldn't (false positive result)
- We accept the alternative hypothesis when we shouldn't
- The sample effect is due to chance

Type II error:
- Not rejecting the null hypothesis when we should (false negative result)
- Rejecting the alternative hypothesis when we shouldn't
- The sample isn't detecting a real population effect
Power
- The probability that we correctly reject the null hypothesis
- Influenced by:
  o Significance level – as α increases, power also increases
  o Sample size – as n increases, power increases, because the SE decreases
  o Variability in the DV – more variability = harder to reject H0
  o Magnitude of the difference between the hypothesised and true values – small difference = less power (harder to reject H0); big difference = more power (easier to reject H0)
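A minimal sketch of how these inputs trade off, using Stata's power command (the numbers are illustrative, not from the notes):

* required sample size for a one-sample t-test detecting a shift
* from a mean of 50 to 55, with SD = 10, alpha = 0.05, power = 0.80
power onemean 50 55, sd(10) power(0.8)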
Lecture 7 – Experimental Data

Background
- Arise when the researcher applies an intervention to the subject or object under investigation
- Aim to find causal relationships between a dependent and independent variable(s)
- More validity than observational studies; the controlled environment prevents chance correlations

Two-sample t-test of means

We test H0: μ1 = μ2 versus H1: μ1 ≠ μ2, at significance level α = 5%.

Steps to complete:
1. Check for normality (Shapiro-Wilk)
by categorical_variablename, sort: swilk numeric_variablename
2. Check for equality of variances (Levene's test)
robvar numeric_variablename, by(categorical_variablename)
3. Find the test statistic and p-value
ttest numeric_variablename, by(categorical_variablename)

If the p-value ≤ 0.05, we reject the null hypothesis.
APA format: Researchers concluded that … .
Experimental studies allow us to make firmer conclusions and recommendations.
Paired t-test

Test conducted on participants of the same group, rather than two separate groups.
- All subjects receive both conditions but in random order, OR matched pairs (couples, twins, matched on age, sex, etc.) where a random member of each pair receives the intervention

We test H0: μd = 0 versus H1: μd ≠ 0
μd = mean of the differences in the population

Assumptions made:
- Differences are numeric
- The sample of differences is approximately normally distributed
- Differences are independent

t-test statistic:

$t = \frac{M_d - \mu_d}{s_d / \sqrt{n_d}}$

with degrees of freedom df = n_d − 1.

In Stata:
ttest variable1 == variable2
- Provides the mean, SD, t-test statistic, df, and p-value
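A minimal sketch, assuming pre- and post-intervention measurements stored as score_pre and score_post for the same participants (hypothetical names):

* paired t-test on the within-subject differences
ttest score_pre == score_post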
Paired t-tests vs one-sample t-tests
- Same formula and degrees of freedom
- The only difference is that we know the differences came from paired observations
- The paired t-test output shows scores of both variables; the one-sample t-test shows the difference in scores
Confidence intervals for means

An interval estimate gives us a range of believable values for the parameter. We call this interval / range of believable values a confidence interval. Samples from the same population tend to have differing results due to sampling variability.
- 95% for STAT1103
- Means that if multiple samples are taken from a population, about 95% of the resulting intervals will contain the true parameter value (μ)
- Confidence level and significance level (α) are complements of each other (95% confidence corresponds to α = 0.05)
- Sample estimate ± critical value × standard error
- Critical value = the value that cuts off a total of 5% across the two tails of the relevant distribution (z or t), i.e., 2.5% in each tail
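A hedged worked instance of the interval formula (the numbers are illustrative, not from the notes): with M = 100, s = 15, n = 36, and t_crit ≈ 2.03 (df = 35),

$100 \pm 2.03 \times \frac{15}{\sqrt{36}} = 100 \pm 5.08 = (94.92,\ 105.08)$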
We test H0: μ = M versus H1: μ ≠ M, at significance level α = 5%.

Steps to complete (if σ is known):
1. One-sample z-test: $z = \frac{M - \mu}{\sigma / \sqrt{n}}$
2. Rearrange to create the interval formula: $M \pm z_{crit} \times \frac{\sigma}{\sqrt{n}}$
In Stata: ztest variable == M, sd(σ)

Steps to complete (if σ is not known):
1. One-sample t-test: $t = \frac{M - \mu}{s / \sqrt{n}}$
2. Rearrange to create the interval formula: $M \pm t_{crit} \times \frac{s}{\sqrt{n}}$
In Stata: ttest variable == M

Steps to complete for a paired t-test:
$M_d \pm t_{crit} \times \frac{s_d}{\sqrt{n_d}}$
In Stata: ttest variable1 == variable2
A non-significant p-value means the interval includes μ.

Steps to complete for two-sample t-tests:
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, with df = n1 + n2 − 2

The formula is rearranged to give a 95% confidence interval for μ1 − μ2:
$(\bar{x}_1 - \bar{x}_2) \pm t_{crit} \times s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

A significant p-value means the interval does not include 0 (i.e., it excludes μ1 − μ2 = 0).

Correlation test:
pwcorr variable1 variable2, sig
gives us the correlation and p-value. An interval helps show the variability in the estimated correlation:
ssc install ci2
ci2 variable1 variable2, corr
Summary

| Hypothesis test | Formula | Confidence interval | df |
|---|---|---|---|
| One-sample z-test | $z = \frac{M - \mu}{\sigma/\sqrt{n}}$ | $M \pm z_{crit} \times \frac{\sigma}{\sqrt{n}}$ | – |
| One-sample t-test | $t = \frac{M - \mu}{s/\sqrt{n}}$ | $M \pm t_{crit} \times \frac{s}{\sqrt{n}}$ | $n - 1$ |
| Paired t-test | $t_d = \frac{M_d - \mu_d}{s_d/\sqrt{n_d}}$ | $M_d \pm t_{crit} \times \frac{s_d}{\sqrt{n_d}}$ | $n_d - 1$ |
| Two-sample t-test | $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ | $(\bar{x}_1 - \bar{x}_2) \pm t_{crit} \times s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ | $n_1 + n_2 - 2$ |
| Correlation | $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ | via ci2 (Stata package) | $n - 2$ |