QTM Lecture HW 8 - homework questions and answers PDF

Title QTM Lecture HW 8 - homework questions and answers
Author Zoe Schreiber
Course Intro to Stat Inference
Institution Emory University
Pages 7



Description

CH4.6: 30 (Cautions in hypothesis testing)
CH5.6: 18, 20, 22 (Inference for paired data)

4.30 Testing for food safety. A food safety inspector is called upon to investigate a restaurant with a few customer reports of poor sanitation practices. The food safety inspector uses a hypothesis testing framework to evaluate whether regulations are not being met. If he decides the restaurant is in gross violation, its license to serve food will be revoked.

(a) Write the hypotheses in words.
H0: The restaurant meets food safety and sanitation regulations.
HA: The restaurant does not meet food safety and sanitation regulations.

(b) What is a Type 1 Error in this context?
A Type I error occurs when H0 is true in reality but is rejected based on evidence from the test. Here, the inspector concludes that the restaurant is not meeting the regulations when it actually is.

(c) What is a Type 2 Error in this context?
A Type II error occurs when H0 is false in reality but is not rejected based on evidence from the test. Here, the inspector concludes that the restaurant is meeting the regulations when it actually is not.

(d) Which error is more problematic for the restaurant owner? Why?
A Type I error, because the restaurant would fail the inspection and lose its license when it should have passed.

(e) Which error is more problematic for the diners? Why?
A Type II error, because diners would believe they are eating at a restaurant that passed inspection when it should have failed.

(f) As a diner, would you prefer that the food safety inspector requires strong evidence or very strong evidence of health concerns before revoking a restaurant’s license? Explain your reasoning.
Strong evidence. A lower evidence threshold makes it more likely that a restaurant with sanitation problems fails the inspection, which pushes restaurant owners to raise their sanitation standards. However, lowering the threshold also increases the chance of making a Type I error, which is a concern as well.
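The trade-off described in part (f) can be illustrated numerically. The sketch below is not part of the exercise and rests on several assumptions: it treats the inspection as a one-sided z-test, maps "strong evidence" to a significance level of 0.05 and "very strong evidence" to 0.01, and assumes a hypothetical true effect of 2 standard errors when the restaurant is in violation. The function name `error_rates` and all numbers are made up for illustration.

```python
from statistics import NormalDist


def error_rates(alpha, effect_in_se):
    """Return (type_1_rate, type_2_rate) for a one-sided z-test.

    alpha: significance level, i.e. the Type I error rate when H0 is true.
    effect_in_se: hypothetical true effect, in standard-error units,
                  when H0 is false (an assumption, not from the exercise).
    """
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - alpha)  # reject H0 when z > cutoff
    # Type II rate: probability of failing to reject when H0 is false.
    type_2 = std_normal.cdf(cutoff - effect_in_se)
    return alpha, type_2


for label, alpha in [("strong evidence", 0.05), ("very strong evidence", 0.01)]:
    type_1, type_2 = error_rates(alpha, effect_in_se=2.0)
    print(f"{label}: Type I rate = {type_1:.2f}, Type II rate = {type_2:.2f}")
```

Under these assumptions, demanding very strong evidence lowers the Type I error rate but raises the Type II error rate, which is the error diners care about most.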

5.18 Paired or not, Part II? In each of the following scenarios, determine if the data are paired.

(a) We would like to know if Intel’s stock and Southwest Airlines’ stock have similar rates of return. To find out, we take a random sample of 50 days, and record Intel’s and Southwest’s stock on those same days.
Paired. Both stocks are recorded on the same 50 days, and on any given day external factors may affect the price of both stocks, so each Intel observation is matched with a Southwest observation. If we had sampled different days for each stock, the data would not be paired.

(b) We randomly sample 50 items from Target stores and note the price for each. Then we visit Walmart and collect the price for each of those same 50 items.
Paired. The same items are priced at both stores, so the two sets of observations are dependent.

(c) A school board would like to determine whether there is a difference in average SAT scores for students at one high school versus another high school in the district. To check, they take a simple random sample of 100 students from each high school.
Not paired. The individual students are not matched; the two groups of students represent two independent random samples.

5.20 High School and Beyond, Part I. The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.

(a) Is there a clear difference in the average reading and writing scores?
No.

(b) Are the reading and writing scores of each student independent of each other?
No. Although they are two different scores, they are measurements on the same individual and are therefore dependent.

(c) Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?

(d) Check the conditions required to complete this test.
(i) The high school seniors were randomly selected, so there is no concern for bias.
(ii) While the reading and writing scores are dependent, the seniors themselves are independent.
(iii) The distribution of the differences in scores is approximately bell-shaped and the sample size is large (n = 200), so the sampling distribution of the sample mean difference is approximately normal and conditions are satisfied for valid inference.

(e) The average observed difference in scores is \(\bar{x}_{read-write} = -0.545\), and the standard deviation of the differences is 8.887 points. Do these data provide convincing evidence of a difference between the average scores on the two exams?
Point estimate: \(\bar{x}_{read-write} = -0.545\), with \(s_{diff} = 8.887\) and \(n = 200\).
\(SE = s_{diff} / \sqrt{n} = 8.887 / \sqrt{200} \approx 0.628\)
\(t = (\bar{x}_{read-write} - \mu_{diff}) / SE = (-0.545 - 0) / 0.628 \approx -0.87\), with \(df = n - 1 = 199\)
The two-sided p-value is approximately 0.39, which is greater than 0.05, so we fail to reject \(H_0\).

The hypotheses for the average difference test are \(H_0: \mu_{diff} = 0\) and \(H_a: \mu_{diff} \ne 0\). The paired data are presumably from less than 10% of the population of high schoolers, and from a simple random sample. We have already seen that the differences are nearly normally distributed, so the conditions are met to apply the t-distribution.

The p-value is not less than 0.05, therefore I conclude that there is not convincing evidence of a difference in students' reading and writing exam scores. On the other hand, the question of a difference between average exam scores, treated as two separate groups, does not seem to be addressed by the data provided. I would need \(\bar{x}_{read} - \bar{x}_{write}\) and the corresponding standard deviation to proceed with that test, and the data would have to be unpaired.
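The arithmetic in part (e) can be checked with a short script. This is a minimal sketch using only the summary statistics given in the exercise; the function name `paired_t_summary` is hypothetical, and the p-value uses a normal approximation to the t distribution (an assumption that is reasonable here only because df = 199 is large).

```python
from math import sqrt
from statistics import NormalDist


def paired_t_summary(mean_diff, sd_diff, n, null_value=0.0):
    """Return (se, t, approx_p) for a two-sided paired test from summary statistics."""
    se = sd_diff / sqrt(n)               # standard error of the mean difference
    t = (mean_diff - null_value) / se    # test statistic
    # Normal approximation to the t distribution (assumes large df).
    approx_p = 2 * NormalDist().cdf(-abs(t))
    return se, t, approx_p


se, t, p = paired_t_summary(mean_diff=-0.545, sd_diff=8.887, n=200)
print(f"SE = {se:.3f}, t = {t:.2f}, approximate two-sided p-value = {p:.2f}")
```

With an exact t distribution on 199 degrees of freedom the p-value would be very slightly larger, which does not change the conclusion.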

(f) What type of error might we have made? Explain what the error means in the context of the application.

A Type I error occurs when we incorrectly reject a null hypothesis that is true, while a Type II error occurs when H0 is false in reality but is not rejected based on evidence from the test. Because we failed to reject H0 above, we may have made a Type II error. In other words, we might have wrongly concluded that there is no difference in the average reading and writing exam scores when a difference actually exists.

(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.

Yes, I would expect a confidence interval for the average difference between reading and writing scores to include 0. We failed to reject the null hypothesis of no difference at the 5% significance level, so the corresponding 95% confidence interval should contain 0. An interval that spans 0 indicates that the difference is not clearly on one side or the other of zero, which is consistent with failing to reject the null hypothesis of no difference.

5.22 High school and beyond, Part II. We considered the differences between the reading and writing scores of a random sample of 200 students who took the High School and Beyond Survey in Exercise 5.20. The mean and standard deviation of the differences are \(\bar{x}_{read-write} = -0.545\) and 8.887 points.

(a) Calculate a 95% confidence interval for the average difference between the reading and writing scores of all students.
\(\bar{x}_{read-write} = -0.545\), \(SE = 8.887 / \sqrt{200} \approx 0.628\), \(df = n - 1 = 199\)
Upper bound: \(-0.545 + 1.96 \times 0.628\)
Lower bound: \(-0.545 - 1.96 \times 0.628\)
Interval: approximately (−1.78, 0.69)

Response feedback: The formula used to calculate the CI is \(\bar{x}_{diff} \pm t^{*}_{df} \times SE\). Here, \(t^{*}\) should be given by the t-score for the 95% confidence level on 199 degrees of freedom. We will use 200 degrees of freedom because the t-table does not have a 199 df line, which gives \(t^{*} \approx 1.97\) rather than 1.96. This t-score is not the t test statistic (−0.87).

(b) Interpret this interval in context.
We are 95% confident that on the reading test students score, on average, 1.78 points lower to 0.69 points higher than they do on the writing test.

(c) Does the confidence interval provide convincing evidence that there is a real difference in the average scores? Explain.
No, since 0 is included in the interval.
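The interval from part (a) can be recomputed following the response feedback. This is a minimal sketch: the function name `paired_ci` is hypothetical, and the critical value 1.97 is a t-table value for 200 degrees of freedom that is passed in by hand rather than computed.

```python
from math import sqrt


def paired_ci(mean_diff, sd_diff, n, t_star):
    """Return (lower, upper) for the confidence interval mean_diff +/- t_star * SE."""
    se = sd_diff / sqrt(n)       # standard error of the mean difference
    margin = t_star * se         # margin of error
    return mean_diff - margin, mean_diff + margin


# t_star = 1.97 is assumed from a t-table (95% confidence, df = 200).
lower, upper = paired_ci(mean_diff=-0.545, sd_diff=8.887, n=200, t_star=1.97)
includes_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.2f}, {upper:.2f}); includes 0: {includes_zero}")
```

Passing the critical value explicitly keeps the distinction from the feedback visible: `t_star` comes from the confidence level and degrees of freedom, not from the test statistic.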

Practice Problem: 5.20 We considered the differences between the temperature readings on January 1 of 1968 and 2008 at 51 locations in the continental US in Exercise 5.19. The mean and standard deviation of the reported differences are 1.1 degrees and 4.9 degrees.

(a) Calculate a 90% confidence interval for the average difference between the temperature measurements between 1968 and 2008.
(−0.05, 2.25)

(b) Interpret this interval in context.
We are 90% confident that the average daily high on January 1, 2008 in the continental US was 0.05 degrees lower to 2.25 degrees higher than the average daily high on January 1, 1968.

(c) Does the confidence interval provide convincing evidence that the temperature was higher in 2008 than in 1968 in the continental US? Explain.
No, since 0 is included in the interval.
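The interval in part (a) of the practice problem can be reproduced as follows. The critical value \(t^{*}_{50} \approx 1.68\) for 90% confidence with \(df = 51 - 1 = 50\) is a t-table value assumed here; it is not stated in the original answer.

```latex
\begin{aligned}
SE &= \frac{4.9}{\sqrt{51}} \approx 0.686 \\
\bar{x}_{diff} \pm t^{*}_{50} \times SE &= 1.1 \pm 1.68 \times 0.686 \approx 1.1 \pm 1.15 \\
&\Rightarrow (-0.05,\ 2.25)
\end{aligned}
```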

