
Title Cheat Sheet for Exam 2 - Summary CALC-BASED INTRO TO STATISTICS
Author melanie shi
Course CALC-BASED INTRO TO STATISTICS
Institution Columbia University in the City of New York

Summary

A summary of all covered concepts before the final exam (non-comprehensive).


Description

Stat Exam 2 review

Recall:

• Formula for expected value: E(X) = Σ x · p(x)
• Formula for variance (use shortcut computational version): V(X) = E(X²) − [E(X)]²
• Formula for standard deviation: σ = √V(X); for sample data, s uses the shortcut computational version s = √( (Σ xᵢ² − n·x̄²) / (n − 1) )
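The expected-value and shortcut-variance formulas above can be sketched for a small discrete pmf (stdlib only; the pmf values below are invented for illustration, not from the notes):

```python
import math

# Illustrative discrete pmf as {value: probability}; probabilities sum to 1.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

ev = sum(x * p for x, p in pmf.items())        # E(X) = sum of x * p(x)
ev_sq = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
var = ev_sq - ev**2                            # shortcut: V(X) = E(X^2) - [E(X)]^2
sd = math.sqrt(var)                            # sigma = sqrt(V(X))
```

Here E(X) = 1.1, V(X) = 1.7 − 1.21 = 0.49, and σ = 0.7.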

Distributions SUMMARY

distribution                      definition
binomial distribution             number of successes in n trials (number of trials is fixed)
geometric distribution            number of failures before the first success (number of trials is not fixed)
negative binomial distribution    number of failures before the r-th success



Binomial distribution
 o Binomial experiment: consists of n trials, where each trial results in success or failure and there is a constant probability of success p
 o Mean = n·p
 o Standard deviation = sqrt(np(1 − p))
 o A binomial random variable is the sum of n independent and identically distributed Bernoulli random variables
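A quick stdlib sketch of the binomial pmf together with the mean and standard deviation formulas above (n = 10 and p = 0.3 are illustrative values, not from the notes):

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
mean = n * p                      # E(X) = np
sd = math.sqrt(n * p * (1 - p))   # sqrt(np(1-p))

# Sanity check: the pmf over k = 0..n sums to 1.
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
```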



Bernoulli
 o The Bernoulli distribution is a discrete distribution having two possible outcomes, labelled 0 and 1, in which 1 ("success") occurs with probability p and 0 ("failure") occurs with probability q = 1 − p, where 0 < p < 1.
 o It therefore has probability mass function
     P(X = x) = p^x (1 − p)^(1 − x),  x ∈ {0, 1}
   which can also be written P(X = 1) = p, P(X = 0) = 1 − p.

Probability plots
 o Principle: you make an xy scatterplot where the x's are the data points, sorted in ascending order, and the y's are the expected z-scores for a normal distribution. If the points fall roughly on a straight line, the data are consistent with normality.
 o http://brownmath.com/ti83/normchek.htm
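The probability-plot construction above can be sketched with the stdlib alone. The plotting position (i − 0.375)/(n + 0.25) used below is one common choice (Blom's formula); other texts use (i − 0.5)/n, and the sample data are invented for illustration:

```python
from statistics import NormalDist

data = [2.9, 3.1, 2.7, 3.4, 3.0, 2.8]   # illustrative sample
xs = sorted(data)                        # x's: data in ascending order
n = len(xs)

# y's: expected z-scores, via the inverse normal CDF at each plotting position.
zs = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

# Plot the pairs (xs[i], zs[i]); a roughly straight line suggests normality.
pairs = list(zip(xs, zs))
```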

Chapter 5

Joint vs. marginal
 o Marginal = summed over the opposite variable

Discrete vs. continuous
 o pmf p(x, y) vs. pdf f(x, y)

   Discrete vs. continuous    pmf/pdf        Summation/integral
   Discrete                   pmf p(x, y)    Summation
   Continuous                 pdf f(x, y)    Integral
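For the discrete case, "marginal = summed over the opposite variable" can be shown directly on a joint pmf table (the joint probabilities below are an invented example):

```python
from collections import defaultdict

# Joint pmf p(x, y) as {(x, y): probability}; entries sum to 1.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

p_x = defaultdict(float)
p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_x[x] += p   # marginal of X: sum over all y
    p_y[y] += p   # marginal of Y: sum over all x
```

Here p_x becomes {0: 0.3, 1: 0.7} and p_y becomes {0: 0.4, 1: 0.6}.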



If X and Y are independent, then we know that E(XY) = E(X)E(Y)
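The independence fact above can be checked numerically: if X and Y are independent, the joint pmf factors as p(x, y) = p(x)·p(y), so E(XY) works out to E(X)E(Y). The marginals below are invented for illustration:

```python
# Invented marginal pmfs for two independent discrete random variables.
p_x = {0: 0.4, 1: 0.6}
p_y = {1: 0.5, 2: 0.5}

e_x = sum(x * p for x, p in p_x.items())   # E(X)
e_y = sum(y * p for y, p in p_y.items())   # E(Y)

# E(XY) computed from the product-form joint pmf p(x)*p(y).
e_xy = sum(x * y * p_x[x] * p_y[y] for x in p_x for y in p_y)
```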

Central Limit Theorem
 • Rule of thumb is n > 50, because this safely covers the underlying distribution

Chapter 6

A point estimate (say θ̂) for a population parameter (say θ) is a numerical quantity (or a statistic) obtained from a sample that can be considered a "best guess" for θ.

For e.g., if we have a sample that comes from a population with population parameters mean μ and std. dev. σ:
 • The sample mean and the sample median would both be point estimates for μ.
 • They are also referred to as estimators for μ.

There can be several possible estimators for a θ. How do we know which is the best estimator? There are a few aspects of the estimators to look for:
 • Of course, the best estimator θ̂ would be one which equals θ. We define:
 • (θ̂ − θ) = error (in estimation)
 • (θ̂ − θ)² = squared error
 • E(θ̂ − θ)² = mean square error (MSE)
 • Let us assume θ̂ is an estimator of the population parameter θ. We can obtain the value of θ̂ from different samples. We could then obtain the sampling distribution of θ̂ (similar to the sampling distribution we talked about for X̄). If the sampling distribution of θ̂ is always centered at the true value of the parameter, then we call θ̂ an unbiased estimator of θ.
 • In other words, a point estimator θ̂ is said to be an unbiased estimator of θ if E(θ̂) = θ for every possible value of θ.
 • If θ̂ is not an unbiased estimator of θ, then the difference [E(θ̂) − θ] is called the bias of θ̂.

Among all the unbiased estimators of the population parameter θ, the estimator with the minimum variance is called the minimum variance unbiased estimator (MVUE) of θ.

Let X₁, …, Xₙ be a random sample of size n from a normal distribution with population parameters mean μ and std. dev. σ; then the sample mean X̄ is the minimum variance unbiased estimator (MVUE) of μ.

The best estimator for a parameter depends critically on the underlying distribution that is being sampled.
 • For example, the sample mean may not always be the best estimator for the measure of centrality of a continuous distribution, due to its sensitivity to outlying observations (a sample from the Cauchy distribution will frequently have a few of these, hence the sample mean is probably not a good estimator when the underlying distribution is Cauchy).
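The MSE comparison between estimators can be sketched by simulation, assuming a normal population (for which the sample mean should show the smaller MSE, consistent with it being the MVUE of μ; all parameter values below are invented for illustration):

```python
import random
import statistics

random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 25, 2000

def mse(estimator):
    """Estimate E[(theta_hat - theta)^2] by repeated sampling."""
    errs = []
    for _ in range(reps):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        errs.append((estimator(sample) - mu) ** 2)  # squared error
    return sum(errs) / reps                          # mean square error

mse_mean = mse(statistics.mean)      # should be near sigma^2 / n = 0.16
mse_median = mse(statistics.median)  # larger for normal data
```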

