Random Sampling PDF

Title Random Sampling
Course Elementary Statistics I
Institution Texas Woman's University
Pages 9
File Size 57.9 KB
File Type PDF

Summary

In week two, we go over random sampling during lectures...


Description

Random Sampling

Much of statistics concerns generalizing from findings observed in a sample to a larger population. This topic explores methods for selecting a sample so that trends and patterns observed in the sample can be reasonably generalized to the larger population of interest.

An important statistical property known as sampling variability refers to the fact that the values of sample statistics vary from sample to sample. The precision of a sample statistic refers to how much its values vary from sample to sample. Precision is related to sample size: sample statistics from larger samples are more precise (closer together) than those from smaller samples. Statistics from larger random samples, therefore, tend to provide a more accurate estimate of the corresponding population parameter.

Sampling Methods

• Simple random sample - A sample of n subjects selected in such a way that every possible sample of the same size n has the same chance of being chosen.
• Systematic sample - A sample in which the researcher selects some starting point and then selects every kth element in the population.

• Stratified sample - A sample in which the researcher subdivides the population into at least two different subgroups (or strata), and then draws a sample from each subgroup.
• Cluster sample - A sample in which the researcher first divides the population into sections (or clusters), then randomly selects some of those clusters and includes all members of the selected clusters.
• Multistage sampling - A sample obtained by combining two or more of the methods above, applied in stages.

Example 1: Taxpayers in a population are listed in order of increasing income. In each situation, determine which type of sampling method was used.

1. A sample is selected by first separating the taxpayers into 4 groups based on income, and then randomly sampling 50 people from each group.
2. A sample is selected by randomly choosing one of the first 100 names, then choosing every 100th name from that point forward.
3. A random number generator in Excel is used to select a sample of size 100.
4. A sample is selected by first separating the taxpayers into 4 groups based on income, and then sampling all members in two of the groups.
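The single-stage sampling methods defined in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of ID numbers 1 to 1000, the four equal-sized groups, and all function names are hypothetical and do not come from the course materials.

```python
import random

def simple_random_sample(population, n, rng):
    """Every possible subset of size n is equally likely to be chosen."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Random start among the first k items, then every k-th item after it."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return [x for group in strata for x in rng.sample(group, n_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(clusters, n_clusters)
    return [x for group in chosen for x in group]

# Hypothetical population: ID numbers 1..1000 in a fixed order.
population = list(range(1, 1001))
# Hypothetical grouping: four consecutive blocks of 250 IDs each.
groups = [population[i:i + 250] for i in range(0, 1000, 250)]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
srs = simple_random_sample(population, 100, rng)
systematic = systematic_sample(population, 100, rng)
stratified = stratified_sample(groups, 50, rng)
cluster = cluster_sample(groups, 2, rng)
print(len(srs), len(systematic), len(stratified), len(cluster))
```

The difference between the last two functions is the one that is easiest to mix up: a stratified sample takes some members from every group, while a cluster sample takes every member from some groups.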

Example 2. Identify the type of sampling method used.

(a) To estimate the percentage of defects in a recent manufacturing batch, a quality-control manager at Intel selects every 8th chip that comes off the assembly line starting with the 3rd until she obtains a sample of 140 chips.
(b) To determine the prevalence of human growth hormone (HGH) use among high school varsity baseball players, the State Athletic Commission randomly selects 50 high schools. All members of the selected high schools’ varsity baseball teams are tested for HGH.
(c) A member of Congress wishes to determine her constituency’s opinion regarding estate taxes. She divides her constituency into three income classes: low-income households, middle-income households, and upper-income households. She then takes a simple random sample of households from each income class.
(d) In an effort to identify if an advertising campaign has been effective, a marketing firm conducts a nationwide poll by randomly selecting individuals from a list of known users of the product.
(e) First, divide the country into “primary sampling units” (PSUs) based on size and population. Then, sample from each PSU and divide those into “ultimate sampling units” (USUs). Finally, randomly sample from each USU.
(f) A farmer divides his orchard into 50 subsections, randomly selects 4, and samples all the trees within the 4 subsections to approximate the yield of his orchard.
(g) A school official divides the student population into five classes: freshman, sophomore, junior, senior, and graduate student. The official takes a simple random sample from each class and asks the members’ opinions regarding student services.

How to use the Table of Random Digits (Page T-1, in the back).

Example 3. Consider the population of 268 words in Lincoln’s Gettysburg Address, which follows, in part:

Four score and seven years ago, our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battlefield of that war...

If we wanted to randomly select 10 words that represent this population of words, we could do so using a Random Digit Table.
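The table lookup for this example, spelled out in the numbered steps that follow, can be mimicked in code. The sketch below is an assumption-laden stand-in: the printed Table T-1 is not reproduced here, so a stream of digits from Python's random module plays its role, and the function name is hypothetical. It also skips 000, on the assumption that the words are numbered starting at 1.

```python
import random

def sample_labels_from_digits(digits, population_size, n, width=3):
    """Read `digits` in consecutive groups of `width` and keep the first n
    distinct labels that fall between 1 and population_size."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if label < 1 or label > population_size:
            continue  # out of range (000, or above 268 here): skip it
        if label in chosen:
            continue  # repeated number: skip it and keep going
        chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Stand-in for the printed table: a long stream of random digits.
rng = random.Random(1)  # fixed seed only so the sketch is repeatable
digit_stream = "".join(str(rng.randrange(10)) for _ in range(3000))

labels = sample_labels_from_digits(digit_stream, population_size=268, n=10)
print(labels)  # ten distinct word numbers between 1 and 268
```

Because only 268 of the 1000 possible 3-digit numbers are usable, roughly three out of every four groups read from the table get skipped, which is why the procedure says to keep going.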

1. First, number all the words in the speech. See the figure below.
2. Then, pick a spot on the Random Digit Table (it doesn’t matter where you begin). Read the digits as 3-digit numbers (since our numbering of the speech goes up to 268).
3. Then, record every 3-digit number as you go, until you’ve recorded 10 different 3-digit numbers.
   • If you come across a 3-digit number greater than 268, skip it and keep going.
   • If a number repeats, skip it and keep going.
4. Use those 10 numbers in the first table to find the 10 words that will be used in your sample.

Topic 7: Displaying and Describing Distributions

In this topic and in the next several, you will turn your attention to summarizing quantitative data. Previously in Topic 2, you learned about using dotplots for displaying the distribution of relatively small datasets of a quantitative variable, and you began to comment on the center and spread of the distribution. In this topic, you will discover some more key features of a distribution and also become familiar with two new visual displays: stemplots and histograms.

Summary Features of Quantitative Variables

• Location (Center, Average) - measure of center. Examples: median, mean.
• Spread (Variability, Consistency) - how spread out the data are. Examples: range, IQR, standard deviation.
• Shape - are values clumped together, and if so, where?
• Outliers - data points not consistent with the rest of the data set.

Describing Shape

• Symmetric - similar on both sides.
• Skewed - values are more spread out on one side of the center than the other side.
  – Right Skewed - the tail extends to the right of the peak longer than to the left.
  – Left Skewed - the tail extends to the left of the peak longer than to the right.

• Stemplot - A plot in which every data value is shown. The “stem” shows the leading digit(s) of a number, and the “leaves” list the last displayed digit of each number that shares that stem.
  – Strengths: great tool for sorting data; with sufficient sample size, can see shape.
  – Weaknesses: with large sample size, may be too cluttered.
• Frequency - count of how many observations fall into a category.
• Histogram - a graph consisting of bars of equal width drawn adjacent (they touch) to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond to the frequency values.
  – Strengths: great tool for judging shape with moderate or large sample sizes.
  – Weaknesses: with a small sample size, the histogram may not “fill in” well enough to see the shape.
• Relative frequency - the proportion or percentage of the count in a category relative to the total number of items in all categories.

  relative freq. = (class frequency) / (sum of all frequencies)

• Cumulative frequency - for a given class, the sum of the frequencies for that class and all previous classes.

Example 4. The table below lists frequencies of the amount of cash each student had in his or her pocket. Use the frequency distribution table to construct the relative frequency distribution and the cumulative frequency distribution.

Money in $    Frequency    Relative Freq.    Cumulative Freq.
0-4           5
5-9           3
10-14         8
15-19         6
20-24         10
25-29         7

Example 5. A car salesman records the number of cars he sold each week for the past year. The following frequency histogram shows the results.

(a) What is the most frequent number of cars sold in a week?
(b) For how many weeks were two cars sold?
(c) Determine the percentage of time two cars were sold.
(d) Describe the shape of the distribution.

Example 6. An experiment was conducted in which two fair dice were thrown 100 times. The sum of the pips showing on the dice was then recorded. The following frequency histogram gives the results.

(a) What was the most frequent outcome of the experiment?
(b) What was the least frequent?
(c) How many times did we observe a 7?
(d) How many more 5’s were observed than 4’s?
(e) Determine the percentage of time a 7 was observed.
(f) Describe the shape of the distribution.

Example 7. Subjects in a psychological study were timed while completing a certain task.

The times below are in seconds.

7.6, 8.1, 9.2, 6.8, 5.9, 6.2, 6.1, 5.8, 7.3, 8.1, 8.8, 7.4, 7.7, 8.2

(a) Complete a stem-and-leaf plot for the list of times above.
(b) According to the stemplot, how many subjects completed the task in the 5-second range?
(c) How many subjects took longer than 7.0 seconds?
(d) How would you describe the shape of this distribution?

Notes:

• A histogram is not the same as a bar graph. A histogram displays the distribution of a quantitative variable, whereas a bar graph displays the distribution of a categorical variable.
• Histograms are much easier to construct with technology than by hand.
• It’s easy to confuse skewness to the right and to the left. Remember the direction of the skew is indicated by the longer tail.
• Pay attention to what measurement units are represented in a stemplot. Include a scale to remind the reader what the units represent (9|2 = 9.2 seconds, or leaf unit = tenths of a second)....
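As the notes say, these displays are easier to build with technology than by hand. One possible sketch of a stemplot builder, applied to the Example 7 times, is shown below. The function name is hypothetical, and the code assumes every value is recorded to exactly one decimal place, so the stem is the whole-number part and the leaf is the tenths digit.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for values recorded to one decimal place.
    Stem = whole-number part, leaf = tenths digit."""
    rows = defaultdict(list)
    for v in values:
        tenths = round(v * 10)           # 7.6 -> 76; rounding avoids float noise
        stem, leaf = divmod(tenths, 10)  # 76 -> stem 7, leaf 6
        rows[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest so empty stems still show up.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem} | {leaves}")
    return lines

# Times in seconds from Example 7.
times = [7.6, 8.1, 9.2, 6.8, 5.9, 6.2, 6.1, 5.8, 7.3, 8.1, 8.8, 7.4, 7.7, 8.2]
for line in stemplot(times):
    print(line)
print("Key: 9 | 2 = 9.2 seconds")
```

Sorting the leaves within each stem and printing the key line are design choices that follow the advice in the notes: the sorted rows make counting easy, and the key tells the reader what units the leaves represent.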

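The relative and cumulative frequency definitions from this topic can also be applied mechanically. The sketch below uses the class frequencies from the Example 4 cash table; the variable names and the three-decimal rounding in the printout are choices made here for illustration, not part of the original worksheet.

```python
from itertools import accumulate

# Classes and frequencies copied from the Example 4 table.
classes = ["0-4", "5-9", "10-14", "15-19", "20-24", "25-29"]
frequencies = [5, 3, 8, 6, 10, 7]

total = sum(frequencies)                        # sum of all frequencies
relative = [f / total for f in frequencies]     # class frequency / total
cumulative = list(accumulate(frequencies))      # running sum of frequencies

print(f"{'Money in $':<12}{'Freq.':>6}{'Rel. freq.':>12}{'Cum. freq.':>12}")
for c, f, r, cf in zip(classes, frequencies, relative, cumulative):
    print(f"{c:<12}{f:>6}{r:>12.3f}{cf:>12}")
```

Two quick checks are worth doing on any table built this way: the relative frequencies should add up to 1 (apart from rounding), and the last cumulative frequency should equal the total number of observations.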
