Title Sample/practice exam 2015, questions - Part A and B
Course Data Analysis
Institution University of Southern Queensland
Pages 19

STUDENT NO:

NAME:

UNIVERSITY OF SOUTHERN QUEENSLAND
FACULTY OF SCIENCES

EXAMPLE EXAMINATION THREE

Course No: STA2300

Course Name: DATA ANALYSIS

Assessment No:

Internal: ×   External: ×

Examiner:

Moderator:

Examination Date:

Time Allowed: Perusal: Ten (10) minutes. Working: Two (2) hours.

This examination carries 50% of the total assessment for this course.

Special Instructions:

This is a RESTRICTED Exam.
Students ARE permitted to write on this examination paper during perusal.
Students ARE permitted to bring into the examination one A4 sheet of paper, written or typed on one or both sides with any material the student wishes to include. Students are required to submit this sheet of paper to the supervisor with their examination paper.
Students ARE permitted to use a scientific or graphics calculator(s) which cannot communicate with any other devices.
Students ARE permitted to use an unmarked paper-based translation dictionary.
Students should attempt to answer all questions in Part A (20 marks) and Part B (30 marks).
Statistical Tables and a Formula Sheet are included at the end of this exam.
Please answer Part A on the CMA answer sheet and Part B in the spaces provided.

All examination question papers must be submitted to supervisors at the end of every examination and returned to USQ. Students are not permitted to retain them.

Any non-USQ copyright material used herein is reproduced under the provisions of Section 200(i)(b) of the Copyright Amendment Act, 1980.

Data Analysis STA2300 Example Exam 3

PART A (20 marks)

Attempt ALL questions in Part A. Answers to be indicated on the Yellow Answer Sheet provided.

Question 1

A consumer research company wants to estimate the average cost of an airline ticket for a round trip within Australia. A random sample of 100 airfares was gathered giving an average price of $412. Identify the variable.
(1) Consumer research company
(2) Categorical
(3) $412
(4) Explanatory
(5) Airline fare

Question 2

The subscribers to a magazine were divided into three different income categories and a random sample was selected from each category to survey about their favourite feature. What type of sampling technique best describes this?
(1) Cluster
(2) SRS
(3) Systematic
(4) Stratified
(5) Convenience

Question 3

Which type of graph makes it easy to graphically compare the centres and spreads of a quantitative variable for several different groups?
(1) Histograms
(2) Scatterplots
(3) Pie charts
(4) Stemplots
(5) Boxplots

Questions 4 – 5 refer to the following information

The following display shows the scores received in Assignment 2 by the 24 students in a Data Analysis tutorial group.

Question 4

What is the median score for this distribution?
(1) 84.5
(2) 79
(3) 8.45
(4) 7.9
(5) 12.5

Question 5

Which of the following best describes the spread of this skewed distribution?
(1) mean
(2) median
(3) standard deviation
(4) IQR
(5) range

Question 6

A sample of ten USQ students are enrolled in the following number of courses this semester.

No. of courses     1   2   3   4
No. of students    1   1   4   4

What is the mean number of courses enrolled in by these ten students?
(1) 2.5
(2) 3
(3) 2.0
(4) 1.0
(5) 3.1
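
As a hedged check of the arithmetic in Question 6, the table can be read as a frequency distribution and the mean computed as a weighted average. The variable names below are illustrative only and are not part of the exam.

```python
# Frequency table from Question 6: number of courses -> number of students.
course_counts = {1: 1, 2: 1, 3: 4, 4: 4}

# Total number of students in the sample.
n_students = sum(course_counts.values())

# Total enrolments: each course count weighted by how many students have it.
total_enrolments = sum(courses * students for courses, students in course_counts.items())

# Weighted mean number of courses per student.
mean_courses = total_enrolments / n_students
print(mean_courses)
```

The same pattern applies to any small frequency table: multiply each value by its frequency, add, and divide by the total frequency.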

Question 7

What numbers are used in making a boxplot?
(1) minimum, Q1, mean, Q3, maximum
(2) minimum, Q1, median, Q3, maximum
(3) s, mean, s², IQR
(4) Q1, Q2, Q3
(5) minimum, IQR, maximum

Question 8

The proportion of voters in Australia who voted for the Labor party in the recent Federal election is best called
(1) a statistic
(2) a parameter
(3) an observation
(4) a quantitative variable
(5) a categorical variable

Question 9

If a student who hadn’t studied all semester guessed all the answers to the 20 multichoice questions in this exam, which probability model would be appropriate to determine the probability of getting 10 or more correct?
(1) binomial with n = 20, p = 0.5
(2) normal with µ = 2, σ = 1.789
(3) binomial with n = 20, p = 0.2
(4) normal with µ = 10, σ = 2.236
(5) binomial with n = 10, p = 0.5
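
For readers who want to see what such a model gives numerically, the following is a minimal sketch of a binomial tail probability. It assumes each of the 20 questions has five equally likely options and that guesses are independent, so p = 0.2; the helper name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(n, p, k_min):
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Assumption: five options per question, pure guessing, so p = 1/5.
p_ten_or_more = binomial_tail(20, 0.2, 10)
print(round(p_ten_or_more, 4))
```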

Question 10

The delivery time of a package sent within Australia has an approximate normal model with mean 4 days and standard deviation 1 day. If 300 packages are being sent, how many packages will arrive in less than 3 days?
(1) 8
(2) 48
(3) 96
(4) 102
(5) 198
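
The calculation in Question 10 can be sketched with Python's standard-library normal distribution. This is only an illustration under the stated model (mean 4 days, standard deviation 1 day); the variable names are assumptions.

```python
from statistics import NormalDist

# Delivery-time model stated in the question: Normal(mean=4, sd=1), in days.
delivery = NormalDist(mu=4, sigma=1)

# Probability that a single package arrives in less than 3 days (z = -1).
p_under_3 = delivery.cdf(3)

# Expected number of such packages out of 300 sent.
expected_packages = 300 * p_under_3
print(round(p_under_3, 4), round(expected_packages))
```

The 68-95-99.7 rule leads to the same place by hand: about 16% of values lie more than one standard deviation below the mean.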

Question 11

When we retain the null hypothesis involving a comparison between two means we
(1) claim that a significant difference exists.
(2) claim the two means are equal.
(3) have committed a Type II error.
(4) can conclude that sampling error is responsible for the observed difference.
(5) have committed a Type I error.

Question 12

You test the hypotheses H0: µ = µ0 and Ha: µ ≠ µ0 based on an SRS of size n from a normal population with unknown σ. You observe a sample mean of ȳ and find the result is statistically significant at level α = 0.01. You may safely conclude which of the following?
(1) While this does not prove the results are practically significant, there is only a 0.01 probability that the results are not practically significant.
(2) While we cannot be sure that the results are practically significant, we can be sure that ȳ and µ0 differ by a sizeable amount.
(3) The P-value is less than or equal to 0.01.
(4) The P-value is greater than or equal to 0.99.
(5) The null hypothesis is 1% true at best.

Question 13

The graph below displays a relationship between the variables height and weight. Which equation best describes the regression line shown?
(1) predicted weight = 0.618 + 149.5 weight
(2) predicted weight = 149.5 + 0.618 height
(3) predicted height = 0.618 + 139.2 weight
(4) predicted height = 139.2 + 0.618 weight
(5) predicted height = 149.5 + 0.618 weight

Question 14

For a sample of 50 Australian family sedans up to 15 years old, the correlation between the ages of the vehicles and their selling prices is −0.78. Interpret the meaning of R² in this context.
(1) Age of the vehicle explains 78% of variation in sale price.
(2) Age of the vehicle explains 61% of variation in sale price.
(3) As age increases, the sale price decreases at the rate of 78%.
(4) As age increases, the sale price decreases at the rate of 61%.
(5) There is only a 61% probability that age is causing price to decrease.

Question 15

In a clinical trial each participant receives a placebo, a low dosage of a drug, or a high dosage of a drug. The participants consist of 90 men and 90 women. The 90 men are randomly divided into three groups of 30 men each. Each group of men is assigned to a different treatment (placebo, low dose, high dose). Likewise, the 90 women are randomly divided into three groups of 30 women each. Each group of women is assigned to a different treatment (placebo, low dose, high dose). What type of design is this?
(1) A randomised block design.
(2) A completely randomised design.
(3) A matched pairs design.
(4) A prospective observational study.
(5) A stratified sampling design.

Question 16

A survey investigating whether the proportion of employees who commute to work by car is higher than it was five years ago finds a P-value of 0.015. Is it reasonable to conclude that more employees are commuting by car now than previously?
(1) No, because there is only a 1.5% probability that this claim is true.
(2) Yes, because there is only a 1.5% probability that this claim is false.
(3) No, because if there is a difference, there is only a 1.5% chance that sampling variation is the cause.
(4) Yes, because if there is no difference, there is only a 1.5% chance that sampling variation is the cause of the observed difference.
(5) No, because the difference is only 1.5% and although this may be statistically significant, it is of no practical significance.

Questions 17 – 19 refer to the following information

The accident history and shade of colour of a sample of 500 vehicles were recorded as follows:

                                   Shade of colour
                                   Dark     Light
Been involved in accident          120      60
Not been involved in accident      180      140

The following output was produced by SPSS for these data.

Question 17

What percentage of vehicles are darkly coloured and have been in an accident?
(1) 67%
(2) 40%
(3) 24%
(4) 36%
(5) 60%

Question 18

Assuming no association exists, the expected number of vehicles which have been involved in an accident and are light in colour is
(1) 108
(2) 60
(3) 90
(4) 72
(5) 125

Question 19

Which of these statements would be the best conclusion from the analysis of these data?
(1) The shade of colour of a vehicle is a factor in causing accidents.
(2) The shade of colour of a vehicle is not a factor in causing accidents.
(3) There is no significant association between colour shade and accident status at the 10% level of significance.
(4) There is a significant association between colour shade and accident status at the 1% level of significance.
(5) There is a significant association between colour shade and accident status at the 5% level of significance.
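
The SPSS output referred to in Questions 17 – 19 is not reproduced here, so the sketch below recomputes the usual contingency-table quantities by hand. It assumes the counts are 120 and 60 (accident; dark, light) and 180 and 140 (no accident; dark, light), and it uses the Pearson chi-square statistic without a continuity correction, which may differ from what a particular SPSS table reports. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Assumed layout: rows = accident / no accident, columns = dark / light.
observed = [[120, 60], [180, 140]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Joint percentage: dark AND involved in an accident.
pct_dark_accident = 100 * observed[0][0] / table_total

# Expected count = row total x column total / table total.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (obs - exp)^2 / exp over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# A 2x2 table has 1 degree of freedom; a chi-square variable with 1 df is the
# square of a standard normal, so the P-value is a two-sided normal tail area.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_sq)))

print(pct_dark_accident, expected[0][1], round(chi_sq, 3), round(p_value, 4))
```

The normal-tail shortcut only works for 1 degree of freedom; larger tables need chi-square tables or a statistics package.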

Question 20

The presence of an association between two categorical variables is indicated by
(1) significant differences in the conditional distributions of the rows of the contingency table.
(2) significant differences in the joint distributions of the two variables in the contingency table.
(3) non-significant differences in the marginal distributions of the contingency table.
(4) non-significant differences in the conditional distributions of the rows of the contingency table.
(5) non-significant differences between conditional distributions of the rows and conditional distributions of the columns of the contingency table.

PART B (30 marks)

Attempt ALL questions in Part B. Answers must be written in the space provided below each question or by circling a response as required. Only answers written in this space will be marked. Part marks may be given for partially correct answers.

Question 1

(3 marks)

(a) What is a distribution?

(b) What is the difference between a statistic and a parameter?

(c) What is meant by a nonparametric procedure?

Question 2

(3 marks)

Data on 100 male and 100 female elite athletes collected at the Australian Institute of Sport included measurements of body mass index (weight in kg/square of height in m). These data are displayed in Figure 1.

Figure 1: Side by side boxplots of body mass index for male and female athletes.

Using the graph in Figure 1, compare, in about 50 words, the distributions of body mass index for the male and female athletes.

Question 3

(8 marks)

A researcher wishes to determine whether people with high blood pressure can reduce their blood pressure by following the Johnson diet. A sample of 200 people with elevated blood pressure were randomly divided into two groups, one being placed on the Johnson diet and the other retained as a control group. The diastolic blood pressure was recorded at the end of the experimental period for each subject and a summary of results follows.

                     Sample size†    Mean     Standard deviation
Treatment group      85              105.1    8.7
Control group        75              116.6    12.7

† Some subjects withdrew or did not comply with the conditions of the study and their results have been withdrawn.

Is there sufficient evidence to claim that the mean diastolic blood pressure of subjects on the Johnson diet is lower than that of the control subjects?

(a) State appropriate null and alternative hypotheses for this test. Define any terms used.

(b) Calculate the appropriate test statistic and state, if appropriate, the degrees of freedom.

(c) Give an approximate P-value and state your conclusion.

(d) From this study only, can we conclude that the Johnson diet reduces blood pressure? Explain why or why not in 50 words or less.
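
The test statistic asked for in part (b) can be sketched as follows, using the two-sample standard error sqrt(s1²/n1 + s2²/n2) from the formula sheet. The choice of degrees of freedom is an assumption: both the conservative value min(n1, n2) − 1 and the Welch approximation are shown, and the course may prefer one or the other. Variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the table in Question 3.
n1, mean1, sd1 = 85, 105.1, 8.7    # treatment group (Johnson diet)
n2, mean2, sd2 = 75, 116.6, 12.7   # control group

# Variance of each sample mean, then the standard error of the difference.
v1 = sd1**2 / n1
v2 = sd2**2 / n2
se_diff = sqrt(v1 + v2)

# Two-sample t statistic for H0: mu1 - mu2 = 0 (difference is treatment - control).
t_stat = (mean1 - mean2) / se_diff

# Two common conventions for the degrees of freedom (assumption: either may be used).
df_conservative = min(n1, n2) - 1
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(round(t_stat, 2), df_conservative, round(df_welch, 1))
```

A t statistic this far below zero puts the one-sided P-value well under the smallest tail area listed in standard t tables for either choice of degrees of freedom.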

Question 4

(4 marks)

A study to relate the number of years of experience of ten traffic police to the average number of tickets they give per week resulted in the following analysis:

(a) Write down the least squares regression equation appropriate for estimating average number of tickets per week from years of experience. Define any variables involved.

(b) Give an interpretation of the value for the slope of the regression line in context.

Question 5

(4 marks)

The manufacturer of a coffee dispensing machine claims the grams per cup has mean 178g and standard deviation 20g.

(a) If 40 cups of coffee are dispensed and measured, what is the probability that the mean amount dispensed is less than 170g?

(b) Suppose we wish to estimate the mean amount of coffee dispensed by the machine to within 2g with 95% confidence. How many cups will need to be sampled?
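
Both parts of Question 5 rest on the sampling distribution of the mean. The sketch below assumes the manufacturer's claimed standard deviation (20g) can stand in for the standard deviation in the sample-size formula n = z*² s² / ME², and that a sample of 40 is large enough for the normal model to apply. Variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

mu, sigma = 178, 20   # claimed mean and standard deviation (grams)

# (a) Sampling distribution of the mean of 40 cups: Normal(mu, sigma / sqrt(n)).
n_cups = 40
se_mean = sigma / sqrt(n_cups)
p_mean_below_170 = NormalDist(mu, se_mean).cdf(170)

# (b) Sample size for a margin of error of 2g at 95% confidence.
margin = 2
z_star = NormalDist().inv_cdf(0.975)   # about 1.96
required_n = ceil((z_star * sigma / margin) ** 2)   # always round up

print(round(p_mean_below_170, 4), required_n)
```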

Question 6

(5 marks)

A drug manufacturer states that only 5% of the patients using a particular drug will experience side effects. Doctors at a large hospital use the drug in treating patients.

(a) Of the next 10 patients that get treated with this drug, what is the probability that at least one will experience side effects?

(b) The hospital expects to treat 250 patients with this drug in the next year. What is the probability that more than 15 of these 250 patients will experience side effects?
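
A minimal sketch of the two calculations in Question 6, assuming patients respond independently and each has a 5% chance of side effects. Part (b) is shown two ways: an exact binomial sum, and a normal approximation without continuity correction (reasonable here since np = 12.5 and nq = 237.5 are both at least 10). Which method is expected is not stated in the exam, and the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

p_side = 0.05   # stated probability of side effects for one patient

# (a) At least one of 10 patients: complement of "none of the 10".
p_at_least_one = 1 - (1 - p_side) ** 10

# (b) More than 15 of 250 patients, exact binomial tail P(X >= 16).
n_patients = 250
p_more_than_15_exact = sum(
    comb(n_patients, k) * p_side**k * (1 - p_side) ** (n_patients - k)
    for k in range(16, n_patients + 1)
)

# (b) Normal approximation with mean np and standard deviation sqrt(npq).
mean_count = n_patients * p_side
sd_count = sqrt(n_patients * p_side * (1 - p_side))
p_more_than_15_normal = 1 - NormalDist(mean_count, sd_count).cdf(15)

print(round(p_at_least_one, 4), round(p_more_than_15_exact, 4), round(p_more_than_15_normal, 4))
```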

Question 7

(3 marks)

How long does it take to commute from home to university? It depends on several factors including route, traffic, and time of departure. The data below are results in minutes of a random sample of 8 trips. Use these data to create a 95% confidence interval for the true mean time to commute to university.

27   38   30   42   24   37   30   39
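
The interval in Question 7 follows the formula-sheet pattern statistic ± critical value × SE(statistic). The sketch below is illustrative only: the critical value 2.365 is read from a standard t table for 7 degrees of freedom at 95% confidence (Python's standard library has no t distribution), and it assumes commute times are roughly normal, which matters with only 8 observations.

```python
from math import sqrt
from statistics import mean, stdev

# Commute times in minutes from Question 7.
times = [27, 38, 30, 42, 24, 37, 30, 39]

n_trips = len(times)
y_bar = mean(times)    # sample mean
s = stdev(times)       # sample standard deviation (n - 1 denominator)

# Assumption: t* for df = n - 1 = 7 at 95% confidence, taken from a t table.
t_star = 2.365

se_mean = s / sqrt(n_trips)
margin_of_error = t_star * se_mean
ci_lower, ci_upper = y_bar - margin_of_error, y_bar + margin_of_error

print(round(y_bar, 3), round(s, 3), round(ci_lower, 2), round(ci_upper, 2))
```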

STATISTICAL TABLES NOT REPRODUCED

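Formula Sheet

• Summary statistics

$\bar{y} = \dfrac{\sum y}{n}$

$s = \sqrt{\dfrac{\sum (y - \bar{y})^2}{n - 1}}$

$z = \dfrac{y - \mu}{\sigma}$

$\hat{p} = \dfrac{x}{n}$

• Confidence intervals

statistic ± critical value × SE(statistic)

• Hypothesis tests

$\text{test statistic} = \dfrac{\text{statistic} - \text{parameter}}{\text{SD(statistic)}}$

Parameter          Statistic                  SD(statistic)                                              SE(statistic)
$p$                $\hat{p}$                  $\sqrt{\dfrac{pq}{n}}$                                     $\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
$\mu$              $\bar{y}$                  $\dfrac{\sigma}{\sqrt{n}}$                                 $\dfrac{s}{\sqrt{n}}$
$\mu_1 - \mu_2$    $\bar{y}_1 - \bar{y}_2$    $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$    $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$\mu_d$            $\bar{d}$                  $\dfrac{\sigma_d}{\sqrt{n}}$                               $\dfrac{s_d}{\sqrt{n}}$

• Sample sizes

$n = \dfrac{z^{*2} s^2}{ME^2}$

$n = \dfrac{z^{*2}\, \hat{p}\hat{q}}{ME^2}$

• Contingency tables

$\text{expected count} = \dfrac{\text{row total} \times \text{column total}}{\text{table total}}$

$\chi^2 = \sum \dfrac{(\text{obs} - \text{exp})^2}{\text{exp}}$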
