Assignment 1 PDF

Title Assignment 1
Course Introduction to Statistics
Institution Athabasca University
Pages 10

Summary

Completed first assignment....


Description

(Revision 10)

Assignment 1 Overview

Total marks: / 75

This assignment covers content from Unit 1 of the course. It assesses your knowledge of many of the statistical terms that will be used throughout the course, as well as your ability to organize, display and summarize data.

Instructions

• Show all your work and justify all of your answers and conclusions, except for the TRUE/FALSE questions.
• Keep your work to 4 decimals, unless otherwise stated.

(22 total marks) 1. Observations from the first 25 days of the year showed that the following number of pedestrians used a particular crosswalk on each day:

19 132 25 78 62 101 84 32 99 71 43 56 64 81 59 67 43 97 105 91 31 66 76 87 125

(8 marks) Construct a frequency distribution for this data using a lower limit for the first class of 15 and a class width of 20. Indicate the class limits, boundaries, midpoints, frequencies and cumulative frequencies.

Class Limits   Frequencies   Midpoints   Boundaries     Cumulative Frequencies   Relative Frequencies
15-34          4             24.5        14.5-34.5      4                        16%
35-54          2             44.5        34.5-54.5      6                        8%
55-74          7             64.5        54.5-74.5      13                       28%
75-94          6             84.5        74.5-94.5      19                       24%
95-114         4             104.5       94.5-114.5     23                       16%
115-134        2             124.5       114.5-134.5    25                       8%
               ∑f=25

Mathematics 215: Introduction to Statistics

Assignment 1

1

Lower limit for the first class: 15
Class width: 20

Frequencies: I sorted the values into the classes, then counted the number of values in each class.
15-34: 19, 25, 32, 31 = 4
35-54: 43, 43 = 2
55-74: 62, 71, 56, 64, 59, 67, 66 = 7
75-94: 78, 84, 81, 91, 76, 87 = 6
95-114: 101, 99, 97, 105 = 4
115-134: 132, 125 = 2

Midpoints: (lower class limit + upper class limit) / 2
(15+34)/2 = 24.5
(35+54)/2 = 44.5
(55+74)/2 = 64.5
(75+94)/2 = 84.5
(95+114)/2 = 104.5
(115+134)/2 = 124.5

Cumulative frequencies:
15-34: 4
35-54: 4+2 = 6
55-74: 4+2+7 = 13
75-94: 4+2+7+6 = 19
95-114: 4+2+7+6+4 = 23
115-134: 4+2+7+6+4+2 = 25

Relative frequencies: class frequency / ∑f
4/25 = 0.16 = 16%
2/25 = 0.08 = 8%
7/25 = 0.28 = 28%
6/25 = 0.24 = 24%
4/25 = 0.16 = 16%
2/25 = 0.08 = 8%
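The tallying described above can be checked with a short script. This is only a sketch: the function name `frequency_table` and the dictionary keys are made up for illustration, and it assumes integer data with no value below the first lower limit.

```python
# Pedestrian counts from question 1.
data = [19, 132, 25, 78, 62, 101, 84, 32, 99, 71, 43, 56, 64,
        81, 59, 67, 43, 97, 105, 91, 31, 66, 76, 87, 125]


def frequency_table(values, first_lower=15, width=20):
    """Build one row per class (hypothetical helper, not from the course)."""
    rows = []
    lower = first_lower
    cumulative = 0
    while lower <= max(values):
        upper = lower + width - 1  # integer data, so class limits are inclusive
        f = sum(lower <= v <= upper for v in values)
        cumulative += f
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "f": f,
            "cum_f": cumulative,
            "rel_f": f / len(values),
        })
        lower += width
    return rows


table = frequency_table(data)
for row in table:
    print(row)
```

Each printed row should line up with one row of the frequency distribution table.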


(4 marks) Create a relative frequency polygon of the number of pedestrians.

(4 marks) Create a percentage ogive of the number of pedestrians.


Cumulative relative frequencies (%):
15-34: 16
35-54: 16+8 = 24
55-74: 16+8+28 = 52
75-94: 16+8+28+24 = 76
95-114: 16+8+28+24+16 = 92
115-134: 16+8+28+24+16+8 = 100
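As a rough check on the cumulative percentages, the running totals can be computed from the class frequencies. The variable names are illustrative, and starting the ogive at 0% at the first lower boundary (14.5) is a common convention assumed here, not something stated in the answer.

```python
from itertools import accumulate

# Class frequencies and upper class boundaries from the table in question 1.
frequencies = [4, 2, 7, 6, 4, 2]
upper_boundaries = [34.5, 54.5, 74.5, 94.5, 114.5, 134.5]
n = sum(frequencies)

# Running total of frequencies, expressed as a percentage of n.
cumulative_percent = [100 * c / n for c in accumulate(frequencies)]

# Points to plot: (upper boundary, cumulative %), starting from 0% (assumed convention).
ogive_points = [(14.5, 0.0)] + list(zip(upper_boundaries, cumulative_percent))

print(cumulative_percent)
print(ogive_points)
```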

(3 marks) Calculate the approximate value of the 30th percentile. What does this number tell you?

$\text{Position of } P_k = \frac{k \cdot n}{100}$

$\text{Position of } P_{30} = \frac{30 \cdot 25}{100} = \frac{750}{100} = 7.5$

First I arranged the values in ascending order, then I substituted into the formula. The 30th percentile is located at position 7.5 in the ordered data. Rounding 7.5 up, the approximate value of the 30th percentile is the 8th value, which is 59. This number tells you that on about 30% of the days fewer than 59 pedestrians used the crosswalk.
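The percentile steps above can be sketched in a few lines. The helper name `approx_percentile` is hypothetical, and the rule of rounding a fractional position up follows the worked answer; how a whole-number position should be handled is not covered by the answer, so the sketch simply uses that position as is.

```python
import math

# Pedestrian counts from question 1.
data = [19, 132, 25, 78, 62, 101, 84, 32, 99, 71, 43, 56, 64,
        81, 59, 67, 43, 97, 105, 91, 31, 66, 76, 87, 125]


def approx_percentile(values, k):
    """Return (position, value) for the k-th percentile (illustrative helper)."""
    ordered = sorted(values)
    position = k * len(ordered) / 100  # k * n / 100
    rank = math.ceil(position)         # round a fractional position up
    return position, ordered[rank - 1]  # ranks are 1-based, list indexes 0-based


position, value = approx_percentile(data, 30)
print(position, value)
```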

(3 marks) Circle True (T) or False (F) for each of the following statements with reference to the original dataset information given in this question.

T  F  The number 132 represents an element.
T  F  The information provided represents the use of cross-section data.
T  F  The number of pedestrians is a continuous variable.

(4 marks) 2. Indicate whether each of the following examples describes the use of descriptive (D) or inferential (I) statistics by clearly circling D or I.

D  I  In my son’s grade-three classroom, 25% of the kids are under the age of eight and 75% are eight or older.
D  I  A comparison between placebo and drug treatments has determined that drug X produces approximately a 5% reduction in the severity of symptoms of Alzheimer’s patients.
D  I  A non-leap year consists of 525,600 minutes.
D  I  By examining the browsing habits of individuals last week, a well-known Internet retailer has estimated that 13% of those people who visit their website will make a purchase.

(5 marks) 3. An average glass of milk contains 115 grams of calcium with a standard deviation of 8 grams. Using Chebyshev’s theorem, construct an interval that contains the calcium content of at least 50% of the glasses of milk. Retain 2 decimal places of accuracy in all calculations.
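No worked answer for this question survives in the extracted text. As a hedged sketch of how the calculation could go: Chebyshev's theorem says at least 1 − 1/k² of the values lie within k standard deviations of the mean, so setting 1 − 1/k² = 0.50 gives k = √2 ≈ 1.41. The function name and the choice to round at each step (following the 2-decimal instruction) are assumptions.

```python
import math


def chebyshev_interval(mean, sd, proportion, decimals=2):
    """Interval holding at least `proportion` of values (illustrative helper)."""
    # Solve 1 - 1/k^2 = proportion for k, rounding as the question instructs.
    k = round(math.sqrt(1 / (1 - proportion)), decimals)
    margin = round(k * sd, decimals)
    return k, round(mean - margin, decimals), round(mean + margin, decimals)


k, low, high = chebyshev_interval(mean=115, sd=8, proportion=0.50)
print(f"k = {k}, interval = {low} to {high}")
```

Under these assumptions the interval comes out as roughly 103.72 to 126.28 grams.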


(11 total marks) 4. The monthly cell phone bills for 100 senior citizens are summarized in the following frequency distribution table:

Cell Phone Bill (in dollars)   Frequency (f)   m       f∙m              m²f                   Cumulative f
10 to less than 20             1               15      1∙15 = 15        225∙1 = 225           1
20 to less than 30             9               25      9∙25 = 225       625∙9 = 5625          10
30 to less than 40             10              35      10∙35 = 350      1225∙10 = 12250       20
40 to less than 50             26              45      26∙45 = 1170     2025∙26 = 52650       46
50 to less than 60             43              55      43∙55 = 2365     3025∙43 = 130075      89
60 to less than 70             11              65      11∙65 = 715      4225∙11 = 46475       100
Total                          n=100           ∑m=240  ∑fm=4840         ∑m²f=247300

(3 marks) Using the frequency distribution table, calculate the mean cell phone bill amount. Note: You may add columns to the table to assist you with your work.

Midpoint = (lower class limit + upper class limit) / 2
First class: (10+20)/2 = 15

Mean cell phone bill amount:

$\bar{x} = \frac{\sum f \cdot m}{n} = \frac{4840}{100} = 48.4$

(3 marks) Using the frequency distribution table, and treating the data as population data, calculate the standard deviation of cell phone bill amount. Note: You may add columns to the table to assist you with your work.

µ = 48.4
N = 100

$\sigma = \sqrt{\frac{\sum m^2 f - \frac{(\sum f m)^2}{N}}{N}} = \sqrt{\frac{247300 - \frac{4840^2}{100}}{100}} = \sqrt{\frac{247300 - \frac{23425600}{100}}{100}} = \sqrt{\frac{247300 - 234256}{100}}$

$\sigma = \sqrt{130.44} = 11.4210$
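The grouped mean and population standard deviation above can be reproduced from the midpoints and frequencies. This is a minimal sketch using the same shortcut formula; the variable names are illustrative.

```python
import math

# Class midpoints and frequencies from the cell phone bill table.
midpoints = [15, 25, 35, 45, 55, 65]
frequencies = [1, 9, 10, 26, 43, 11]

n = sum(frequencies)                                                # N = 100
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))         # sum of f*m
sum_m2f = sum(f * m ** 2 for f, m in zip(frequencies, midpoints))   # sum of m^2*f

mean = sum_fm / n
# Population variance via the shortcut formula used in the worked answer.
variance = (sum_m2f - sum_fm ** 2 / n) / n
sigma = math.sqrt(variance)

print(mean, round(variance, 2), round(sigma, 4))
```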


(2 marks) Estimate the class range that contains the median cell phone bill amount.

There are 100 values in this data set, so the median is at position 50.5 (the average of the 50th and 51st values). The cumulative frequency is 46 at the end of the 40 to less than 50 class and 89 at the end of the 50 to less than 60 class, so the 50th and 51st values both fall in the 50 to less than 60 class.

(2 marks) Does the data appear to be skewed? If yes, in what direction? Describe how you came to this conclusion.

The mean is 48.4 and the classes start at 10 and end at 70. The frequencies peak in the 50 to less than 60 class (43 of 100) and tail off toward the lower classes, and the mean lies below that peak, so the graph would look something like this:

[Sketch not reproduced]

The graph isn’t fully accurate; it’s just there to show that the data appears to be skewed left.

(1 mark) Circle True (T) or False (F) for the following statement:

T  F  Selecting 100 senior citizens out of 500 who live in a particular town in order to calculate the mean cell phone bill of seniors in the town is an example of sampling without replacement.
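For the median class in question 4, one way to read the cumulative frequency column is to find the first class whose cumulative frequency reaches the median position. A cumulative frequency of 46 means only 46 bills are below 50 dollars, so position 50.5 is reached in the next class. The helper name `median_class` is hypothetical.

```python
from itertools import accumulate

# Class ranges ("lower to less than upper") and frequencies from the table.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
frequencies = [1, 9, 10, 26, 43, 11]


def median_class(classes, frequencies):
    """Return (median position, class containing it) (illustrative helper)."""
    n = sum(frequencies)
    position = (n + 1) / 2  # 50.5 for n = 100
    for bounds, cum_f in zip(classes, accumulate(frequencies)):
        if cum_f >= position:  # first class whose running total reaches the position
            return position, bounds


print(median_class(classes, frequencies))
```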


(33 total marks) 5. The following data represents the number of cigarettes smoked per day for a sample of 18 teenagers, each of whom identifies as a smoker:

26 15 21 26 9 31 20 30 21 0 19 25 16 32 14 27 28 18

(4 marks) Construct a stem-and-leaf display for this data. Place the leaves in ascending order.

S | L
0 | 0 9
1 | 4 5 6 8 9
2 | 0 1 1 5 6 6 7 8
3 | 0 1 2

∑x = 378

(2 marks) What is an advantage of using a stem-and-leaf display for this data as compared to a histogram?

A stem-and-leaf display shows every individual data value; a histogram does not.

(2 marks) Calculate the mean number of cigarettes smoked.

$\bar{x} = \frac{\sum x}{n} = \frac{378}{18} = 21$

(7 marks) Calculate the standard deviation and coefficient of variation for the number of cigarettes smoked.

$s = \sqrt{\frac{n \sum x^2 - (\sum x)^2}{n(n-1)}} = \sqrt{\frac{18 \cdot 9100 - 142884}{306}} = \sqrt{\frac{20916}{306}} = \sqrt{68.3529} = 8.2676$

$C.V. = \frac{s}{\bar{x}} \cdot 100 = \frac{8.2676}{21} \cdot 100 = 0.3937 \cdot 100 = 39.37\%$
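The sample mean, standard deviation and coefficient of variation can be recomputed directly from the 18 values. This sketch uses the same shortcut formula as the worked answer; the variable names are illustrative.

```python
import math

# Cigarettes smoked per day by the 18 teenagers in question 5.
cigarettes = [26, 15, 21, 26, 9, 31, 20, 30, 21, 0, 19, 25, 16, 32, 14, 27, 28, 18]

n = len(cigarettes)
sum_x = sum(cigarettes)                    # sum of x
sum_x2 = sum(x * x for x in cigarettes)    # sum of x^2

mean = sum_x / n
# Sample standard deviation via the shortcut formula (divides by n(n - 1)).
s = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))
cv = s / mean * 100                        # coefficient of variation, in percent

print(mean, round(s, 4), round(cv, 2))
```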

(5 marks) Calculate the quartiles and interquartile range for this data.

Ordered data: 0, 9, 14, 15, 16, 18, 19, 20, 21, 21, 25, 26, 26, 27, 28, 30, 31, 32

Q1: 16
M/Q2: 21, since (21+21)/2 = 21
Q3: 27
IQR = Q3 - Q1 = 27 - 16 = 11
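The quartiles above follow the median-of-halves approach: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half. A small sketch of that approach is below; the helper name `quartiles` is hypothetical, it assumes at least two values, and other software may use a different quartile definition and give slightly different results.

```python
from statistics import median

# Cigarettes smoked per day by the 18 teenagers in question 5.
cigarettes = [26, 15, 21, 26, 9, 31, 20, 30, 21, 0, 19, 25, 16, 32, 14, 27, 28, 18]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method (illustrative)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # values below the median position
    upper_half = ordered[-half:]   # values above the median position
    return median(lower_half), median(ordered), median(upper_half)


q1, q2, q3 = quartiles(cigarettes)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```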

(4 marks) Sketch a box-and-whisker plot on the graph paper below. Indicate any outliers.

To check if there are any outliers:
IQR = 11
1.5∙11 = 16.5
Lower fence: 16 - 16.5 = -0.5
Upper fence: 27 + 16.5 = 43.5
There are no outliers because all of the data values are between -0.5 and 43.5.

(2 marks) What is an advantage of representing the data using a box-and-whisker plot as compared to a stem-and-leaf display?

A box-and-whisker plot lets you see the center, spread and skew easily. It is also a lot easier to spot outliers than on a stem-and-leaf display.

(3 marks) What is the percentile rank of a teenager who smokes 31 cigarettes per day? What does this number tell you?

16 teenagers in the sample smoke fewer than 31 cigarettes per day
18 teenagers total

$\frac{16}{18} \cdot 100 = 88.89\%$

This shows that 88.89% of the teenagers in the sample smoke fewer than 31 cigarettes per day, so the remaining 11.11% smoke 31 or more.

(3 marks) Circle True (T) or False (F) for each of the following statements with reference to the sample information given in this question:

T  F  If the sample of teenagers is selected in a way that ensures that each teenager has an equal chance of being selected, this is called a simple random sample.
T  F  The fact that the mean number of cigarettes smoked per day for the sample will differ from the population mean number of cigarettes smoked per day, because the sample is a subset of the population, leads to a non-sampling error.


T  F  If the sample of 18 teenagers is selected in a way that ensures that 9 of the teenagers are randomly selected from the male smokers and the other 9 teenagers are randomly selected from the female smokers, this is an example of systematic random sampling.
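The outlier check and percentile rank from earlier in question 5 can be sketched as follows. The quartile values 16 and 27 are taken from the worked answer, the 1.5 × IQR fences match the check shown there, and the helper name `percentile_rank` is hypothetical.

```python
# Cigarettes smoked per day by the 18 teenagers in question 5.
cigarettes = [26, 15, 21, 26, 9, 31, 20, 30, 21, 0, 19, 25, 16, 32, 14, 27, 28, 18]

q1, q3 = 16, 27  # quartiles from the worked answer
iqr = q3 - q1

# Values outside these fences would be flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in cigarettes if x < lower_fence or x > upper_fence]


def percentile_rank(values, x):
    """Percentage of values strictly less than x (illustrative helper)."""
    below = sum(v < x for v in values)
    return below / len(values) * 100


print(lower_fence, upper_fence, outliers)
print(round(percentile_rank(cigarettes, 31), 2))
```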

These extra pages are for additional calculations. If you need them for your solutions, please reference them in the appropriate place in the questions.
