Lab1 - Lab PDF

Title	Lab1 - Lab
Course	Statistical Modelling
Institution	University College Dublin
Pages	9


STAT10060 – Statistical Modelling

Minitab - Lab 1

Part 1: Introduction to Minitab - Data Entry & Data Manipulation

1. Log into a desktop machine in the lab by entering your username and password.

2. The software used in this module is called Minitab and can be accessed by going to the Start button, and then choosing All Programs -> Teaching Applications -> Mathematics and Statistics -> Minitab. At home/off campus you can access Minitab by logging into UCD Connect and accessing ‘Software for U’ on the ‘Home’ tab.

3. When you launch MINITAB you should see a session window & worksheet of empty cells (into which you can enter your data).

[Screenshot: the Minitab Session Window and Data Window (Worksheet).]

Input the following data into the columns labelled C1, C2, C3 and C4. To enter data, click on a cell with the cursor, type in the number and press RETURN.

C1     C2   C3   C4
2365   92   82   96
4261   75   96   83
3929   98   60   72
2677   62   55   40
2417   79   72   81
3804   81   70   78

4. You can toggle between the data window and session window by placing your cursor over either and clicking once. To activate the session window, click once on the Editor option in the menu list along the top of the screen. A list of choices should now appear underneath the Editor. Choose Enable Commands. Note that you will be presented with a prompt which looks something like this MTB > . Now type the command PRINT C1 - C4 in the session window and press return.

In MINITAB the data window contains your data and the session window is where you type in all of your commands.

5. Go back to the data window, and insert a new row between rows 1 and 2 containing the data

4879   84   84   80

by clicking on any cell in row 2 and then clicking on Editor->Insert Rows. You should now have a row of blank cells. Type the data above into those cells.

6. The data you have entered are exam scores for seven students. Give the name ID to column 1 by clicking on the grey blank cell under the column name C1 and above the first row of data, typing ID and pressing RETURN. Similarly give names EXAM1, EXAM2 and EXAM3 to columns C2, C3 and C4 respectively.

7. Replace the entry in the 5th row of column 3 with 79 by clicking on the cell and typing 79. [Or you can use the command LET C3(5)=79 followed by RETURN in the session window.]

8. To display some descriptive statistics, select the STAT menu, click on BASIC STATISTICS and then DISPLAY DESCRIPTIVE STATISTICS. Click on the Variables box and enter the variables EXAM2 and EXAM3 by double clicking on them. Click OK.

MINITAB will now print the results for each variable in the session window. You will see N - the number in the sample, N* - the number of missing values, Mean - the sample mean, Stdev - the standard deviation of the sample, SE Mean - the standard error of the mean, Minimum - the minimum value, Q1 - the first quartile, Median - the sample median, Q3 - the third quartile and Maximum - the maximum value.

9. Construct a column C5 containing the total of the three exam scores for each student. Type LET C5 = C2 + C3 + C4 in the session window and press return. Give a name to this variable e.g. call it TOTAL.

10. Now sort the records for the students in descending order of total score. Click on the Data menu and click Sort. Click inside the Sort Column box. Now select all the variables by double clicking each variable in turn from the list on the left hand side. These variables should now appear in the Sort Column box. Click on the first Sort By Column box and enter the variable TOTAL by double clicking on it. Click the Descending Button. In the second Sort By Column box enter the variable ID in the same way.

Store the sorted data in the original columns by clicking on the Original Column(s) button. Click on OK.

This will sort the students in descending order of total score and (should two students have the same score) in ascending order of ID number.
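As a cross-check outside Minitab, steps 9 and 10 can be sketched in Python. This is only an illustration, not part of the lab: the tuple layout and the names `records`, `with_total` and `ranked` are assumptions, and the rows are the seven records as they stand after the insert in step 5 and the edit in step 7.

```python
# Rows are (ID, EXAM1, EXAM2, EXAM3) after the insert in step 5 and the edit in step 7.
records = [
    (2365, 92, 82, 96),
    (4879, 84, 84, 80),
    (4261, 75, 96, 83),
    (3929, 98, 60, 72),
    (2677, 62, 79, 40),
    (2417, 79, 72, 81),
    (3804, 81, 70, 78),
]

# Step 9: TOTAL = EXAM1 + EXAM2 + EXAM3, appended as a fifth field.
with_total = [row + (sum(row[1:]),) for row in records]

# Step 10: sort by TOTAL descending, then by ID ascending to break ties.
ranked = sorted(with_total, key=lambda row: (-row[4], row[0]))

for row in ranked:
    print(row)
```

Negating the total in the sort key gives descending order on TOTAL while keeping ascending order on ID, which mirrors the two Sort By Column boxes.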

11. Get an average of the 3 exam results for each student. Go to the command prompt in the Session window and type LET C6 = C5/3. Name this variable AVERAGE.

12. Now create a new variable called GRADE in column C7. Apply the following grades to each student based on their average mark over the 3 exams. Go to Data -> Code -> Numeric to Text. In the Code Data from Columns box select AVERAGE. In the Into Column box select GRADE. Enter the following grading structure into the remaining boxes and click OK.

Original values   New
0:39              E
40:54             D
55:69             C
70:84             B
85:100            A

Notice that one student has not been graded. Explain why this is and suggest a remedy.

13. We now want to round the average score to the nearest percent. Type the following in the session window, LET C6 = round(C6,0). The zero here indicates round to 0 decimal places.

Notice that the average scores are now displayed as whole percentages. Repeat the grading procedure. Describe and explain any difference in this result from last time.
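The grading in steps 12 and 13 is an interval recode followed by rounding. A rough Python sketch of the idea follows, assuming each range `low:high` includes both endpoints and that a value lying in none of the ranges is left uncoded. `GRADE_BANDS`, `grade` and `round_half_up` are hypothetical names for illustration, not Minitab functions, and the two test values are made up.

```python
import math

# Grade bands from step 12, assumed inclusive at both ends.
GRADE_BANDS = [(0, 39, "E"), (40, 54, "D"), (55, 69, "C"), (70, 84, "B"), (85, 100, "A")]


def grade(average):
    """Return the grade whose band contains `average`, or None if no band does."""
    for low, high, label in GRADE_BANDS:
        if low <= average <= high:
            return label
    return None


def round_half_up(value):
    """Round a non-negative value to the nearest whole number (halves go up)."""
    return math.floor(value + 0.5)


inside = 76.5    # illustrative value inside a band
between = 69.5   # illustrative value lying between two bands

print(inside, grade(inside), grade(round_half_up(inside)))
print(between, grade(between), grade(round_half_up(between)))
```

Comparing the coded result before and after rounding for each value is the same comparison the lab asks you to make in Minitab.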

14. To save the work you have done today, we are going to create a new folder in your H drive called ‘Minitab’. Double click on the ‘Computer’ icon on the desktop, and choose your H drive. Create a new folder and call this folder ‘Minitab’.

15. Go back to Minitab and save the Minitab project you have created under the name LAB1.MPJ in your personal directory by going to File->Save Project As. Make sure the directory you are saving to is h:\Minitab. (If you don’t know how to change to this subdirectory ask someone to show you how.) Type LAB1 for ‘File Name’ and check that ‘Minitab Project (.MPJ)’ is the ‘Save as Type’ option. Click on OK.

16. Now, leave MINITAB by clicking on File -> Exit.

17. Try reopening Minitab and the ‘Project’ you have just saved.

Homework (NOT TO BE HANDED UP): The lecturer for these students wishes to award grades in a particular way. Instead of treating each exam separately she wishes to give double weighting to EXAM1 (i.e. treat EXAM1 as if it were equivalent to 2 exams). She has decided to award the following grades based on a recalculated average of the exam results with this in mind.

Original values   New
0:39              Fail
40:54             Pass
55:69             Hons.2
70:100            Hons.1

Use Minitab to calculate this new average for each student and place it in an unused column. Name this new variable appropriately. Apply the new grading structure and store the result in a new (appropriately named) variable.
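Treating EXAM1 as two exams amounts to a weighted mean with weights 2, 1 and 1, so the divisor becomes 4 rather than 3. A small Python sketch of that arithmetic, where `weighted_average` is a hypothetical helper and the scores are made up rather than taken from the worksheet:

```python
def weighted_average(exam1, exam2, exam3):
    """Average three scores, counting exam1 as if it were two exams."""
    return (2 * exam1 + exam2 + exam3) / 4


# Illustrative scores only.
print(weighted_average(80, 60, 70))
```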

Part 1 Summary:

After this lab you should be able to:

- Enter data in the data window in Minitab.
- Label variables.
- Insert a new row of data in a dataset in the data window.
- Activate the session window and enable commands.
- Create new variables (including coded variables).
- Sort data in numerical / alphabetical order.
- Round off a variable (to 0, 1, 2 etc decimal places).
- Get descriptive statistics.


Part 2: Hypothesis Testing - Comparing Two Population Means

With hypothesis tests we are interested in making inferences about a population parameter of interest, e.g. the mean or a proportion. We can use hypothesis testing to answer many questions. For example, we might want to know if one drug is better than another at reducing blood pressure, or whether the presence of air bags in a vehicle reduces the severity of injuries in a traffic accident. The steps for conducting a hypothesis test should be stated each time one is performed.

Elements of Hypothesis Testing (Summary from lecture notes)

1) Choose the population characteristic of interest (e.g. $D_0$, the difference between $\mu_1$ (the mean of population 1) and $\mu_2$ (the mean of population 2)).

State the Null Hypothesis (Ho): We begin by stating the null hypothesis (Ho). For example, we might state that there is no difference between the means of population 1 and population 2. Ho: $\mu_1 - \mu_2 = 0$

State the Alternative Hypothesis (Ha): We choose an alternative hypothesis (Ha). For example, we might state that the alternative hypothesis is that a difference exists between the means of population 1 and population 2. Ha: $\mu_1 - \mu_2 \neq 0$

2) Choose a significance level for the test (e.g. α = 0.05, i.e. if the null hypothesis is true there is a 5% chance of incorrectly rejecting it, which is called a Type I error).

3) State any assumptions.

4) Calculate the test statistic.

5) Find the rejection region. We find the rejection region such that the probability of rejecting the null hypothesis incorrectly is equal to α.

6) Compare the test statistic to the rejection region. Examine whether the test statistic falls inside or outside the rejection region.

7) Conclusion. Draw a conclusion i.e. either reject Ho or fail to reject Ho.

8) Interpret the conclusion in terms of the question. State your conclusion in the context of the question you are trying to answer.

1. Comparing two population means - large samples.

When testing hypotheses about the difference between two population means when we have large samples, we can use the Central Limit Theorem to tell us about the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$.

Summary From Lecture Notes

1. The mean of the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is $(\mu_1 - \mu_2)$.

2. If the two samples are independent, the standard deviation of the sampling distribution is

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $\sigma_1^2, \sigma_2^2$ are the variances of the two populations, $n_1, n_2$ are the sample sizes of the two samples respectively and $s_1^2, s_2^2$ are the variances of the two samples. $\sigma_{(\bar{x}_1 - \bar{x}_2)}$ is also known as the standard error of the statistic $(\bar{x}_1 - \bar{x}_2)$.

3. By the central limit theorem the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately normal for large samples (i.e. where $n_1$ and $n_2$ are both ≥ 30).

4. A large sample confidence interval for $(\mu_1 - \mu_2)$ is therefore:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{crit}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm z_{crit}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{crit}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
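The standard error and the large sample confidence interval above can be sketched in Python as a check on hand calculations. This is a minimal sketch, not the lab's method: the function names and the summary statistics are made up for illustration, and the sample variances stand in for the population variances as in the approximation above.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(s1, n1, s2, n2):
    """Estimated standard error of (x1_bar - x2_bar) for independent samples."""
    return sqrt(s1**2 / n1 + s2**2 / n2)


def large_sample_ci(x1_bar, s1, n1, x2_bar, s2, n2, confidence):
    """Two-sided large-sample confidence interval for mu1 - mu2."""
    # Two-sided: split the leftover probability equally between both tails.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = x1_bar - x2_bar
    margin = z_crit * standard_error(s1, n1, s2, n2)
    return diff - margin, diff + margin


# Made-up summary statistics, not taken from any lab data set.
low, high = large_sample_ci(
    x1_bar=52.0, s1=6.0, n1=36, x2_bar=50.0, s2=8.0, n2=64, confidence=0.95
)
print(low, high)
```

The inputs are sample standard deviations, so they are squared inside `standard_error` to match the $s_1^2$ and $s_2^2$ terms in the formula.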

On the Blackboard you will find a data set called Lab_IQ.mtw. This data set contains two variables: IQ Group 1 are the IQ scores for children in their first year of school who had attended pre-school for 1 year, IQ Group 2 are the IQ scores for children in their first year of school who had not attended any pre-school. An educational psychologist wishes to investigate whether the data provide any evidence that children who attend pre-school display higher IQ scores.

Here are the familiar steps in hypothesis testing:

Step 1. Choose the population characteristic of interest. $D_0$ - the difference between the population means (i.e. $\mu_1 - \mu_2$).
State null hypothesis. Ho: $D_0 = 0$ (i.e. $\mu_1 - \mu_2 = 0$).
State alternative hypothesis. Ha: $D_0 > 0$ (i.e. $\mu_1 - \mu_2 > 0$).

Step 2. Choose the significance level. α = 0.10 (or 10% level).

Step 3. State any assumptions. Large samples.

Step 4. Choose and calculate the test statistic. In this case the test statistic is:

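$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Using the descriptive statistics (see lab 1), calculate the test statistic:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

= ______________________________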
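Once Minitab has reported the means, standard deviations and sample sizes, the arithmetic for the test statistic can be checked with a short Python sketch. `two_sample_z` is a hypothetical helper, and the numbers passed to it are placeholders rather than the Lab_IQ results.

```python
from math import sqrt


def two_sample_z(x1_bar, s1, n1, x2_bar, s2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0, using sample variances."""
    return ((x1_bar - x2_bar) - d0) / sqrt(s1**2 / n1 + s2**2 / n2)


# Placeholder summary statistics; substitute the values Minitab reports.
z = two_sample_z(x1_bar=52.0, s1=6.0, n1=36, x2_bar=50.0, s2=8.0, n2=64)
print(z)
```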

Step 5. Choose a rejection region Since the alternative hypothesis includes only differences in means greater than 0, this is a one-tailed test. The rejection region will be in the upper tail of the standard normal distribution. Therefore we need to get the z critical value which is the value such that 10% of the standard normal distribution is to the right and therefore 90% is to the left.

[Figure: standard normal curve. We require the z critical value such that 10% of the area is to the right and 90% is to the left.]

We can get this value in Minitab with the following commands (recall how to enable commands from lab 1).

    MTB > INVCDF .90;
    SUBC> NORMAL 0 1.

Equivalently, we can go to Calc -> Probability Distributions -> Normal. We need to check the ‘Inverse cumulative probability’ option, ensure we’re asking Minitab about the normal with mean 0 and standard deviation 1, and tell Minitab that we want it to evaluate this function at 0.9.
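The same inverse cumulative probability can be cross-checked outside Minitab. Below is a minimal Python sketch using the standard library's `NormalDist`; `reject_h0` is a hypothetical helper that encodes the upper-tail rejection region from Step 5.

```python
from statistics import NormalDist

alpha = 0.10

# Upper-tail critical value: the point with 1 - alpha of the area to its left.
z_crit = NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha)


def reject_h0(z, z_crit):
    """Upper-tailed test: reject H0 when the statistic exceeds the critical value."""
    return z > z_crit


print(z_crit)
```

Running this should agree with the Minitab output and with the statistical tables to the number of decimal places they give.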

What is the z critical value in this case? _____________ Verify this in the New Cambridge Statistical Tables.

Step 6, Step 7, Step 8. Compare the test statistic to the critical value, make a conclusion and state the conclusion in the context of the question.

The test statistic is lower/greater than the critical value hence we reject/fail to reject the Ho at α = _______ and conclude that ________________________________________________________________ _________________________________________________________________

Now let’s additionally calculate a two sided 90% confidence interval for the difference between the 2 population means. Careful – what’s the relevant z critical value here?

z critical value = ___________________

90% Confidence Interval = (____________________, _____________________)

Part 2 Summary:

After this lab you should be able to:

- perform a hypothesis test for two large independent samples

END

Homework Assignment sheet 1 due week 4
