STAT1201 Online Learning Notes PDF

Title STAT1201 Online Learning Notes
Course Analysis Of Scientific Data
Institution University of Queensland


STAT1201

Online Learning Notes

Semester 2, 2020.

Module 1.1: Variability

Error Variability:

- Encompasses natural variability (variation as dictated by nature, e.g. height) and measurement variability (variability based on measurement techniques and discrepancies, e.g. measurement error = variability in data).
- The presence of error variability makes it necessary to replicate our experiments.

Group Variability:

- Variability caused by differences between groups (e.g. height differs between genders).
- A standard statistical approach to seeing if there is a difference between the groups is to see if the group variability is larger than the error variability.

Sampling Variability:

- Sampling variability is how much an estimate varies between samples, e.g. one group of males may have a different average height than a second group of males.
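The idea of an estimate varying between samples can be illustrated with a small simulation. This is a minimal sketch, not part of the original notes: the population of heights is simulated, and the function name `sample_means` and all numbers (mean 175 cm, SD 7 cm, samples of 25) are assumptions chosen only for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population of 10,000 male heights in cm (simulated, not real data).
population_rng = random.Random(42)
population = [population_rng.gauss(175, 7) for _ in range(10_000)]

# Each sample of 25 males gives a slightly different estimate of the mean height.
means = sample_means(population, sample_size=25, n_samples=200)
print(f"smallest sample mean: {min(means):.1f} cm")
print(f"largest sample mean:  {max(means):.1f} cm")
print(f"spread (SD) of the sample means: {statistics.stdev(means):.2f} cm")
```

The spread of the sample means is the sampling variability of the estimate; it should be noticeably smaller than the spread of individual heights in the population.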

What is a variable?

- A variable is a characteristic that we can record about the subjects or objects in a study. These can be measurements we make, like a forearm length or blood pressure, or can be attributes, like sex or age.
- It is important to classify variables as being either quantitative or categorical. These two types will require different tools for exploration and analysis.

Variable Types:

Quantitative:

- Quantitative variables represent measurements, such as the height of a person or the temperature of an environment. These can be:
  - Continuous variables, taking any value over some range. Continuous variables capture the idea that measurements can always be made more precisely.
  - Discrete variables, taking only distinct, separate values (often a small number of possibilities), such as a count of some outcomes or an age measured in whole years.

Categorical:

- Categorical variables represent groups of objects with a particular characteristic. For example, recording the sex of subjects is essentially the same as making a group of males and a group of females.
  - Nominal variables, such as sex, are arbitrary categories with no order between them.
  - Ordinal variables are those whose categories do have an order. A common example of this is in recording the age group someone falls into. We can put these groups in order because we can put ages in order.

In this section we learned that:

- The need for data analysis comes from the variability present in data. Separating the differences between groups from background variability is a fundamental task of statistical analysis.


- It is important to be able to identify the types of variables recorded in a study. Data from quantitative and categorical variables will be described and analysed in different ways.

Module 1.2: Designing studies

Comparative Study:

- A comparative study splits its subjects into two groups, one for the control and one for the treatment.
- Randomization, which assigns participants to either group by chance, is an effective way to reduce bias in measuring the effect of the treatment.

Figure 1: Structure of a Comparative study

Block Design:

- With a randomized block design, the experimenter divides subjects into subgroups called blocks, such that the variability within blocks is less than the variability between blocks.
- Then, subjects within each block are randomly assigned to treatment conditions.

Figure 2: Structure of a Block Design
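The two steps of a randomized block design (form blocks, then randomize within each block) can be sketched in code. This is an illustrative sketch only: the function `randomized_block_assignment`, the subject identifiers, and the choice of age group as the blocking variable are all hypothetical and do not come from the notes.

```python
import random
from collections import Counter, defaultdict


def randomized_block_assignment(subjects, block_of, seed=None):
    """Randomly assign subjects to control/treatment separately within each block."""
    rng = random.Random(seed)

    # Step 1: divide the subjects into blocks using the blocking variable.
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    # Step 2: shuffle within each block and split the block in half.
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        half = len(members) // 2
        for subject in members[:half]:
            assignment[subject] = "control"
        for subject in members[half:]:
            assignment[subject] = "treatment"
    return assignment


# Hypothetical subjects as (id, age group); age group is the blocking variable.
subjects = [
    (f"S{i:02d}", "under 40" if i <= 8 else "40 and over") for i in range(1, 17)
]
assignment = randomized_block_assignment(subjects, block_of=lambda s: s[1], seed=3)

# Count how many subjects from each block ended up in each group.
counts = Counter((subject[1], group) for subject, group in assignment.items())
for (block, group), n in sorted(counts.items()):
    print(block, group, n)
```

Because the split happens inside each block, both treatment groups contain the same number of subjects from every block, so the blocking variable cannot be confused with the treatment effect.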

Randomization:

- A method based on chance alone by which study participants are assigned to a treatment group.
- Randomization minimizes systematic differences among groups by distributing people with particular characteristics evenly, on average, among all the trial arms.
- This randomization is an important step in comparative experiments. It means that on average the two groups will have the same proportions of subject characteristics (e.g. height, gender).
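Assignment by chance alone can be sketched as a shuffle followed by a split. The example below is a minimal sketch under assumptions: the function name `randomize` and the subject labels are hypothetical, and a fixed seed is used only so the illustration is repeatable.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into control and treatment groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


# Hypothetical subject labels S01..S20.
participants = [f"S{i:02d}" for i in range(1, 21)]
groups = randomize(participants, seed=1)
print("control:  ", sorted(groups["control"]))
print("treatment:", sorted(groups["treatment"]))
```

No characteristic of a subject influences which group they land in, which is what allows characteristics such as height or gender to balance out on average.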

Observational Studies:

- An observational study has the same aims as an experiment but is passive, often working with existing data such as medical records.


P-Value:

- A small P-value suggests the null hypothesis (H0) may be wrong, instead giving evidence for the alternative hypothesis (H1).
- A large P-value suggests that the data are consistent with the null hypothesis.

Null Hypothesis (H0):

- The null hypothesis will usually be a statement of “no effect”.
- The null hypothesis is usually denoted H0 when discussing the theory of hypothesis testing, but you will rarely find this notation appearing in scientific papers that use hypothesis tests. In fact, it is rare for authors to specify the null hypothesis at all, though it is usually easy to infer what it was, based on the statement of results.

Alternative Hypothesis (H1):

- Related to the null hypothesis is the alternative hypothesis. Denoted by H1, this is usually what we want to show, and it gives us the direction by which we judge a possible outcome to be as unusual as the one actually observed.
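One way to see how H0, H1 and the P-value fit together is a permutation test, which is a technique swapped in here for illustration and is not necessarily the method used in the course. Under H0 ("no effect") the group labels are interchangeable, so the P-value is estimated as the proportion of random relabellings whose difference in means is at least as large, in the direction given by H1, as the one observed. The data values and the function name `permutation_p_value` are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treatment, control, n_perm=5000, seed=0):
    """Estimate a one-sided P-value for H1: mean(treatment) > mean(control)."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)

    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Under H0 the labels carry no information, so shuffle them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treatment]) - mean(pooled[n_treatment:])
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_perm


# Hypothetical measurements, invented purely for illustration.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.4, 11.8, 12.9, 11.5, 12.2]

p_value = permutation_p_value(treatment, control)
print(f"estimated one-sided P-value: {p_value:.4f}")
```

A small estimated P-value here means that very few relabellings produce a difference as large as the observed one, which is evidence against H0 in the direction specified by H1.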

Decisions:

- A traditional use of hypothesis testing has been as a tool for decision making. To do this a threshold is chosen, such as 0.05, and if we find a P-value which is less than 0.05 then we say that “the results were significant at the 5% level”. This will also often be written in journal articles as “the results were found to be significant (p < 0.05)”.

[...] p2. Using the z statistic gives

corresponding to a P-value close to 0. This is very strong evidence in favour of H1, suggesting that the nicotine inhalers have increased the probability of sustaining a reduction in smoking. We can calculate a 95% confidence interval for the difference between the proportions as

Thus, we are 95% confident that the underlying difference is between about 10% and 24%, again suggesting that the presence of nicotine in the inhaler has an effect.

Summary:

In this section we have learned that:

- The Normal approximation to the Binomial distribution can be used to calculate confidence intervals and hypothesis tests for a population proportion based on a sample proportion.
- This approach can be extended to a comparison of two population proportions.
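The comparison of two proportions summarised above can be sketched as follows. This is a hedged illustration, not the notes' own calculation: the counts are hypothetical (they are not the inhaler study's data), the function names are invented, and the standard errors follow one common convention (pooled proportion for the test, unpooled proportions for the interval), which may differ from the formulas used in the course.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard Normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(x1, n1, x2, n2):
    """z statistic and one-sided P-value for H0: p1 = p2 against H1: p1 > p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    return z, 1.0 - normal_cdf(z)


def difference_ci_95(x1, n1, x2, n2):
    """Approximate 95% confidence interval for p1 - p2 (Normal approximation)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin = 1.96 * se
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Hypothetical counts: 60 successes out of 200 versus 30 out of 200.
z, p_value = two_proportion_z_test(60, 200, 30, 200)
low, high = difference_ci_95(60, 200, 30, 200)
print(f"z = {z:.2f}, one-sided P-value = {p_value:.5f}")
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

An interval that excludes 0 tells the same story as a small P-value: the data are not consistent with equal underlying proportions.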

Module 9.1: Analysis of Variance

Sample Variance

The F statistic is given by:

$$F = \frac{\mathrm{MSG}}{\mathrm{MSR}}$$

Where,


Note MSG = mean sum of squares for the groups, MSR = mean sum of squares for the residuals. Since a mean sum of squares is just a variance, the F statistic is a ratio of variances. Results of an Analysis of Variance (ANOVA) are traditionally stored in an ANOVA table...
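The ratio of variances behind the F statistic can be computed directly. The sketch below assumes the standard one-way ANOVA definitions, MSG = SSG / (k - 1) and MSR = SSR / (n - k) for k groups and n observations in total; the notes' own definitions are not visible in this extract, so treat these as an assumption. The function name `one_way_anova` and the data are hypothetical.

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Group sum of squares: how far each group mean sits from the grand mean.
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Residual sum of squares: how far observations sit from their own group mean.
    ssr = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msg = ssg / (k - 1)  # mean sum of squares for the groups
    msr = ssr / (n - k)  # mean sum of squares for the residuals
    return {
        "df_groups": k - 1,
        "df_residuals": n - k,
        "SSG": ssg,
        "SSR": ssr,
        "MSG": msg,
        "MSR": msr,
        "F": msg / msr,
    }


# Hypothetical measurements for three groups, invented for illustration.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name}: {value}")
```

A large F means the variability between group means is large relative to the variability within groups, which echoes the Module 1.1 idea of comparing group variability with error variability.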

