Zoom Session 1 Stat 200 Lecture Notes

Course Introduction to Statistics
Institution University of Maryland Global Campus




How data are arranged

Data are arranged in rows and columns. There is one row for each observational unit or individual and one column for each variable.

Individual – a person or object that you are interested in finding out information about.

Variable – the measurement or observation recorded for the individual.

Scenario: A researcher was interested in collecting data about the health of her patients. She took a random sample of 14 patients, wrote down their gender, measured their BMI, and asked them their age.

The 14 sampled patients are listed 1 through 14.

Variables: Sex, BMI, and Age were collected for each person. For example, person 1 is female (Sex is coded 1 for male and 2 for female), underweight, and 78 years old.

Population – the complete set of all members of the group under study.

Sample – a subset of the population.

In our example, the population is all of the researcher's patients, and the sample is the 14 patients she selected. Suppose she has 1000 patients on her roll (the population). She did not have the money or time to collect information on everyone, so she collected it from a subset (a smaller pool) of her patients: the sample.
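As a rough illustration of drawing a sample from a population, the sketch below picks 14 patients at random from a roll of 1000. The patient ID numbers and the seed value are assumptions made for the example; the notes do not say how the researcher actually selected her patients.

```python
import random

# Hypothetical patient IDs 1..1000 stand in for the population (the full roll).
population_ids = list(range(1, 1001))

# A fixed, arbitrary seed keeps the sketch repeatable; it is not from the notes.
rng = random.Random(200)

# Simple random sample of 14 patients, drawn without replacement.
sample_ids = rng.sample(population_ids, k=14)

print(len(population_ids))  # size of the population
print(len(sample_ids))      # size of the sample
print(sorted(sample_ids))   # which patients ended up in the sample
```

Changing the seed gives a different set of 14 patients, which is why a number calculated from a sample can vary from one sample to the next.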

Parameter – a number calculated from the population, usually denoted with a Greek letter. It is a fixed but unknown number that you want to find.

Statistic – a number calculated from the sample, usually denoted with a letter from the Latin alphabet, though sometimes with a Greek letter with a ^ (called a hat) above it. Because you can actually collect a sample, the statistic is readily known, though its value changes depending on the sample taken. It is used to estimate the parameter value.

Once we collect the information on the sample of patients, we want to summarize the data and make inferences (informed guesses) about the population. A numerical summary of a population is called a parameter. A numerical summary of a sample is called a statistic.

µ (mu, the parameter symbol for the average) – the researcher might want to know the average age of all 1000 of her patients.

x-bar (the statistic: the average from the sample, which describes the sample) – the researcher cannot afford to compute an average over the whole population, so the average is computed on the sample. It is a statistic and serves as an estimate of the population average. The sample average can change because the sample can change: we could have sampled a different 14 people.

x-bar = (78 + 44 + 72 + … + 67)/14 = 838/14 ≈ 59.86

The average patient age in the sample is about 59.86 years.
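The sample mean can be checked with a few lines of Python. The notes list only some of the 14 ages, so this sketch works from the stated total of 838 rather than guessing the missing values. The variable names are illustrative.

```python
# Totals taken from the notes: the 14 sampled ages add up to 838.
# The individual ages are only partly listed, so we use the total directly.
age_total = 838
n = 14

# Sample mean (x-bar) = sum of the observations / number of observations.
x_bar = age_total / n

print(round(x_bar, 2))  # 59.86
```

The exact quotient is 59.857..., which rounds to 59.86 (cutting it off after two decimals gives 59.85).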

What if the researcher wants to take an average of Sex and BMI? The average as we know it can only be calculated from a particular type of numerical data. Thus, the ability to summarize the data depends on the type of data.

Variable Types

Qualitative or categorical variable – describes a quality of the individual. In our example, Sex and BMI (recorded here as underweight, normal, or overweight) are categories that people belong to, or qualities that describe the person.

Quantitative or numerical variable – a number that can be counted or measured from the individual. In our example, Age is the quantitative variable that we collected.

A quantitative variable can be either discrete or continuous. Discrete data are usually things you count (as on your fingers) and take whole-number values, not decimals. Continuous data can take on any value, are usually things you measure, and can be decimals.

Quantitative Variable

Age: is it continuous or discrete? Age can take on any value along a number line, such as 30.5 (30 years 6 months), so it can be a decimal and is continuous. It is not a count variable like the number of cars in a parking lot: we can count cars 1, 2, 3, …, but we can't have 2.5 cars.

How can we change gender into a count variable?

Week #    Number of females    Number of males
1         5                    8
2         8                    6
3         5                    8
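Because the weekly clinic counts in the table above are quantitative, ordinary arithmetic applies to them. The short sketch below uses the counts from the table; the variable names are illustrative.

```python
# Weekly clinic counts from the table: one entry per week (weeks 1, 2, 3).
weeks = [1, 2, 3]
females = [5, 8, 5]
males = [8, 6, 8]

# Counts are discrete quantitative data, so sums and means are meaningful.
total_visits = [f + m for f, m in zip(females, males)]
mean_females = sum(females) / len(females)
mean_males = sum(males) / len(males)

print(total_visits)            # [13, 14, 13]
print(mean_females)            # 6.0
print(round(mean_males, 2))    # 7.33
```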

Here the experimental or observational unit that we are collecting data about is the week, so each row is a week. For each week, we record the number of females and the number of males who come to the clinic. Number of females and number of males are variables; they are count, or discrete quantitative, variables. Sex in our previous example, although represented by numbers like 1 or 2, was still a category: you are either 1 (Male) or 2 (Female).

Whether the data are qualitative or quantitative (continuous or discrete) tells me how I can describe them. Age is continuous, so I can use the average or mean to describe the data. BMI and Sex are qualitative, so how can we describe a typical value for qualitative data? With a proportion or percentage.

Sex: 2 out of 14 are Female and 12 out of 14 are Male.

p-hat (Female) = 2/14 ≈ 0.14

p-hat (Male) = 12/14 ≈ 0.86

p-hat is a sample statistic that describes qualitative data. BMI has 3 categories: underweight (3), normal (6), overweight (5).

p-hat (underweight) = 3/14 ≈ 0.21
p-hat (normal) = 6/14 ≈ 0.43
p-hat (overweight) = 5/14 ≈ 0.36
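The proportions above can be reproduced from the category counts. This is a minimal sketch using the counts given in the notes; the helper name `proportions` is made up for the example.

```python
# Category counts from the notes (14 patients in the sample).
sex_counts = {"Female": 2, "Male": 12}
bmi_counts = {"Underweight": 3, "Normal": 6, "Overweight": 5}


def proportions(counts):
    """Return each category's share of the total (the sample proportion)."""
    n = sum(counts.values())
    return {category: count / n for category, count in counts.items()}


sex_props = proportions(sex_counts)
bmi_props = proportions(bmi_counts)

for category, p in sex_props.items():
    print(category, round(p, 2))   # Female 0.14, Male 0.86
for category, p in bmi_props.items():
    print(category, round(p, 2))   # Underweight 0.21, Normal 0.43, Overweight 0.36
```

The proportions within one variable always add up to 1, which is a quick check that no category was missed.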

After the initial classification of our variables into qualitative and quantitative, we then care about their measurement scales, which tell us which statistics we can use. Note that the measurement scales operate like a step ladder: each one builds on the previous one.

Nominal – a category or quality. It has no order, and you cannot do any arithmetic, such as averaging, on this level of data. Examples are gender, car name, ethnicity, and race.

Ordinal – data that is nominal, but that you can now put in order, since one value is more or less than another. You still cannot do arithmetic on this data. Examples are degrees (A.S., B.S., M.S., PhD) and the size of a drink (small, medium, large). The distances between consecutive values are not equal. For degrees, for example:
High School to A.S.: 2 years
B.S. to M.S.: 1 or 2 years
M.S. to PhD: 4 to 6 years

Interval – data that is ordinal, but with equal distances between consecutive values. You can do arithmetic on this data, but only addition and subtraction, not division. An example is temperature. An interval scale has an arbitrary zero, not an absolute one: zero does not mean the absence of something. Example: 0°F does not mean an absence of temperature. 0°F was defined arbitrarily by a scientist, Daniel Fahrenheit, who originally defined it as the freezing temperature of a brine solution made from a mixture of water, ice, and ammonium chloride.

Ratio – data that is interval, but with a meaningful zero: zero means the absence of the quantity. You can do division and all other arithmetic on this data. Examples are height, weight, and salary. Example: a salary of 0 means the absence of salary or income.
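One way to see why division is not meaningful on an interval scale but is on a ratio scale is to change units and check whether the ratio survives. The sketch below uses the standard Fahrenheit-to-Celsius conversion; the temperatures and salaries are made-up numbers for illustration only.

```python
def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


# Interval scale: hypothetical temperatures of 80°F and 40°F.
ratio_f = 80 / 40                   # 2.0 in Fahrenheit
ratio_c = f_to_c(80) / f_to_c(40)   # about 6 in Celsius: the "ratio" depends on the unit

# Differences are still comparable: two equal gaps stay equal after conversion.
gap_ratio_f = (80 - 60) / (60 - 40)
gap_ratio_c = (f_to_c(80) - f_to_c(60)) / (f_to_c(60) - f_to_c(40))

# Ratio scale: hypothetical salaries in dollars, then in cents.
salary_ratio_dollars = 80_000 / 40_000
salary_ratio_cents = (80_000 * 100) / (40_000 * 100)

print(ratio_f, round(ratio_c, 2))
print(gap_ratio_f, round(gap_ratio_c, 2))
print(salary_ratio_dollars, salary_ratio_cents)
```

Saying 80°F is "twice as hot" as 40°F does not hold up once the unit changes, while one salary being twice another is true in any currency unit because zero salary really means no salary.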

Looking back at our original example: Sex has no order and can't be put on a number line; it is just a quality or description of a person. Women are not greater than (>) men even though women were given the number 2 and men 1. Sex is nominal. BMI has categories, but they do have an order. Underweight...

