Unit Outline PDF

Title Unit Outline
Author Anonymous User
Course Introduction to Statistics
Institution University of Sydney


STAT 5002 Introduction to Statistics – Semester 1, 2020 Unit Information Sheet

Websites:
It is important that you check the STAT5002 Canvas website regularly. It can be accessed at https://canvas.sydney.edu.au. Announcements, such as assessment tasks, will be made on this page at various times throughout the semester.

Unit lecturer:
• Dr. Yves Tam ([email protected])

Lecture time and place:
• Wed 18:00-21:00 in Carslaw Lecture Theatre 175 (tentative)

Contact hours:
• Held at ABS 4036, Wed 17:00-18:00 (tentative)

Lectures and Tutorials:
• Classes run for 13 weeks.
• You will need to bring your own laptop with R and RStudio installed to each lecture and tutorial.
• Attendance at tutorials will be recorded. Please email [email protected] if you are unable to attend any of your classes.
• The lecture and tutorial notes for a given week will be available on the STAT5002 Canvas page prior to each lecture/tutorial. In general, the tutorial material will be based on the lecture material of the same week.
• You should bring a copy of the current week's lecture and tutorial notes to your classes each week.
• Most tutorial problems will require the use of the computer software R.

Suggested reading:
There are no required texts for this unit. The following text is recommended as a reference book.
• All of Statistics, Larry Wasserman, Springer (2004)
Further, the books and online resources listed below may prove helpful.
• David Freedman, Robert Pisani and Roger Purves. Statistics. Norton, 2007. ISBN: 978-0-393-52210-5.
• R Development Core Team: An Introduction to R. http://cran.r-project.org/doc/manuals/R-intro.pdf
• Hadley Wickham, Advanced R, 2015. ISBN: 978-1-4665-8697-0.
In addition, The Comprehensive R Archive Network site at http://cran.r-project.org/ contains lots of useful material. In particular, you can download your own free copy of the R binary, and browse through the FAQs (Frequently Asked Questions).

Computing & data sets:
The computer package R with the RStudio interface will be used during this course. R can be freely downloaded from the CRAN site given above and RStudio is available from https://www.rstudio.com/. You should have both installed on your device. All data sets used in lectures and tutorials will be made available from the Canvas course webpage.

Where to go for help:
• For administrative matters, go to the Student Services Office, Carslaw Building Room 520 or email [email protected]. Ensure that any emails you send to this address contain your name and SID, otherwise they will be ignored.
• For help with the statistics, see your lecturer, your tutor or use the Canvas discussion forum. Lecturers guarantee to be available during their indicated contact hours, but may be available at other times as well.

Assessment:
Your final raw mark for this unit will be calculated as follows:

Method of Assessment      Tentative Timing            Weighting
Online quizzes            weeks 4, 8, 12              12%
Mid-semester quiz         week 7                      20%
Assignment                week 12                     8%
2h written examination    Final examination period    60%

High Distinction (HD), 85-100: representing complete or close to complete mastery of the material;
Distinction (D), 75-84: representing excellence, but substantially less than complete mastery;
Credit (CR), 65-74: representing a creditable performance that goes beyond routine knowledge and understanding, but less than excellence;
Pass (P), 50-64: representing at least routine knowledge and understanding over a spectrum of topics and important ideas and concepts in the course.

Objectives:
This unit aims:
• to introduce techniques for summarising experimental univariate and bivariate data, such as that obtained in various branches of science, medicine, commerce, etc., by means of elementary statistics and diagrams;
• to use probability theory to provide a mathematical framework for real life data modelling;
• to introduce statistical inference and show how statistical tests can provide evidence for or against a scientific question;
• to introduce the fundamental concepts of analysis of data from both observational studies and experimental designs using classical linear methods;
• to gain competency in the application and understanding of linear models and regression methods with diagnostics for checking appropriateness of models; and
• to enhance proficiency in the use of the R binary to give analyses and graphical displays.

Outcomes:
Students who successfully complete this unit will be able to
• use the R statistical computing environment to obtain numerical and graphical summaries of data, and to perform various statistical calculations;
• explain univariate and bivariate data by means of the five number summary, mean, variance and standard deviation, correlation coefficient, boxplot, histogram and scatterplot;
• use methods derived from the three axioms of probability to calculate the probabilities of simple events;
• understand the concept of a random variable and the meaning of the expected value and variance;
• apply the Binomial distribution as a model for discrete data;
• use the Normal distribution as a model for continuous data;
• understand the central limit theorem;
• understand the concept of hypothesis tests and p-values for finding evidence for or against simple null hypotheses, in particular using the binomial test for testing proportions, and the one- or two-sided z-, t- or sign-test for making inference about the population mean;
• use the Chi-squared test for simple goodness of fit problems;
• understand the concept of a confidence interval;
• understand the fundamental difference between frequentist-based inference and Bayesian inference;
• find the least squares regression line as a way of describing a linear relationship in bivariate data;
• use R to analyse multivariate data;
• use the general F-test as the main tool to choose between two nested regression models;
• assess model assumptions and detect outliers in regression models through standard diagnostic plots (box plot, scatterplot, Q-Q-plot, Cook's distance plot, leverage vs residual plot) and through influence measures (leverage values, Cook's distance);
• apply multiple linear regression and understand R² and the adjusted R²;
• calculate and interpret confidence intervals for all parameters in linear regression;
• perform model selection using the F-test, t-test, AIC or BIC, through full searches or by using step-wise procedures (backward, forward, stepwise);
• apply polynomial regression models; and
• apply logistic and non-parametric regression.

Intended topics outline:

Week    Topics
1       Introduction to Statistics
2       Probability
3       Random Variables
4       Hypothesis Testing for a Mean
5       Test for Means (Z and T Tests)
6       Test for Goodness of fit (Chi-Squared Tests)
7       Confidence Intervals
8       Bivariate Data
10      Multiple Linear Regression
11      Model Selection
12      Logistic Regression and Non-parametric Regression
13      Bayesian Inference

last adjustments: February 12, 2020 by Yves Tam
