Stats test 1 cheat sheet PDF

Title Stats test 1 cheat sheet
Author Jacob Dowd
Course Advanced Statistical Methods for Business
Institution Texas State University
Pages 2

Summary

Used this cheat sheet to get an A on test....


Description

Getting started:
Statistics - The science of gathering, describing, and analyzing data; OR the actual numerical descriptions of sample data.
Population - A statistical study centers upon a particular group of interest. The population consists of all persons or things being studied.
Data - Information, in particular, information prepared for a study.
Census - When data are obtained from every member of the population.
Parameter - The numerical description of a particular population characteristic.
Sample - A subset of the population from which data are collected.
Sample Statistic - The actual numerical description of a particular sample characteristic.
Population vs. Sample:
1. Population: the whole group; the group I want to know about; characteristics are called parameters; parameters are generally unknown; a parameter is fixed.
2. Sample: part of the group; the group I do know about; characteristics are called statistics; statistics are generally known; statistics change with the sample.
Descriptive Statistics - This branch of statistics, as a science, gathers, sorts, summarizes, and displays the data.
Inferential Statistics - This branch of statistics, as a science, involves using descriptive statistics to estimate population parameters.

Level of measurement:
Qualitative Data - Consist of labels or descriptions of traits of the sample; also known as categorical data. Examples of qualitative data include information such as favorite foods, hometown, eye color, or identification numbers. Generally, qualitative data will consist of words; however, numbers can act as words, or placeholders. For example, numbers on a football jersey serve only to distinguish different players during a game.
Quantitative Data - Consist of counts or measurements and, therefore, are numerical. Examples of quantitative data are test scores, average rainfall, and median weights. Quantitative data can be manipulated in ways that qualitative data cannot.
Also, if the measurement scale of the data has equal distances or intervals between the values, then the data are quantitative.
Continuous Data - Data that can take on any value within some interval; normally measurements such as length or width.
Discrete Data - Data that refer to individual numbers that are countable, such as 0, 1, and 2; for example, a house can have 2 or 3 pets, not 2.5.
Qualitative data have a level of measurement that is either Nominal or Ordinal. Quantitative data have a level of measurement that is either Interval or Ratio.
Nominal - Data that represent whether a variable possesses some characteristic; nominal data values are just numbers being used as names and nothing else.
Ordinal - Data that represent categories that have some associated order; ordinal data possess order.
Interval - Data that can be ordered and for which the arithmetic difference is meaningful.
Ratio - Ratio data are similar to interval data, except that they have a meaningful zero point and the ratio of two data points is meaningful.

Frequency & distributions:
Ordered array - An ordered list of the data from largest to smallest or vice versa, so that the lowest and highest values are easy to see.
Distribution - Displays the data values that occur and how frequently each value occurs.
Frequency distribution - A table that divides the data into groups, called classes, and shows how many data values occur in each group.
Frequency, f, of a group, or class - The number of data values in that class.
Ungrouped frequency distribution - A frequency distribution where each class represents a single value.
Grouped frequency distribution - A frequency distribution where the classes are ranges of possible values.
Steps for Constructing a Frequency Distribution:
1. Decide how many classes should be in the distribution.
2. Choose an appropriate class width. To find an appropriate class width, begin by subtracting the lowest number in the data set from the highest number in the data set and dividing the difference by the number of classes. Rounding this number up gives a good starting point from which to choose the class width. You will want to choose a width so that the classes formed present a clear representation of the data, so make a sensible choice. Also note that the class width is the difference between the lower limits of consecutive classes, which are defined in the next step.
3. Find the class limits. The lower class limit is the smallest number that can belong to a particular class, and the upper class limit is the largest number that can belong to a class. Using the minimum data entry, or a smaller number, as the lower limit of the first class is a good place to begin. However, judgment is required. You should choose the first lower limit so that reasonable classes will be produced, and it should have the same number of decimals as the data set. After choosing the lower limit of the first class, add the class width to it to find the lower limit of the second class. Continue this pattern until the classes are complete. The upper limit of each class is determined such that the classes do not overlap. If, after creating your classes, there are data that fall outside the class limits, you must adjust either the class width or the choice for the first lower class limit.
4. Determine the frequency of each class. Make a tally mark for each piece of data in the appropriate class. Count the marks to find the total frequency for each class.
Class boundaries - Split the difference in the gap between the upper limit of one class and the lower limit of the next class. To find a class boundary, add the upper limit of one class to the lower limit of the next class and divide by two. For example, if an upper class limit is 10, and the next lower class limit is 11, the class boundary would be 10.5.
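The four construction steps and the class-boundary rule above can be sketched in Python. This is only an illustration under stated assumptions: the function names and the sample data are hypothetical, the data are whole numbers (so each upper limit is one less than the next lower limit), and the first lower limit is simply the minimum value. The cheat sheet itself stresses that judgment is needed when choosing the width and the first limit.

```python
import math

def build_frequency_distribution(data, num_classes):
    """Group whole-number data into classes (hypothetical helper)."""
    lowest, highest = min(data), max(data)
    # Step 2: (highest - lowest) / number of classes, rounded up.
    width = math.ceil((highest - lowest) / num_classes)
    # If the highest value would fall outside the last class,
    # adjust the class width, as the steps above require.
    if lowest + width * num_classes - 1 < highest:
        width += 1
    classes = []
    for i in range(num_classes):
        # Step 3: each lower limit is the previous lower limit plus the width.
        lower = lowest + i * width
        # Assumption: whole-number data, so classes do not overlap.
        upper = lower + width - 1
        # Step 4: count the data values that fall in this class.
        frequency = sum(1 for x in data if lower <= x <= upper)
        classes.append((lower, upper, frequency))
    return width, classes

def class_boundaries(classes):
    """Average each upper limit with the next class's lower limit."""
    return [(classes[i][1] + classes[i + 1][0]) / 2
            for i in range(len(classes) - 1)]

# Made-up sample data, for illustration only.
scores = [1, 3, 5, 7, 8, 10, 12, 14, 15, 18, 19, 20]
width, classes = build_frequency_distribution(scores, 4)
for lower, upper, frequency in classes:
    print(f"{lower}-{upper}: {frequency}")
print("Boundaries:", class_boundaries(classes))
```

With this sample data the classes come out as 1-5, 6-10, 11-15, and 16-20, so the boundary between the second and third classes is 10.5, matching the worked example above.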
Midpoint, or class mark - The sum of the lower and upper limits of the class divided by 2.
Relative frequency - The percentage of the data set that falls into a particular class. It is calculated by dividing the class frequency by the sample size. The sample size, n, for a frequency distribution can be found by adding all of the class frequencies together.
Cumulative frequency - The sum of the frequency for a given class and the frequencies of all previous classes. The cumulative frequency of the last class equals the sample size.

Pie charts and bar graphs:
Pie chart - Shows how large each category is in relation to the whole; it is created from a frequency distribution by using the relative frequencies. The size, or central angle, of each wedge in the pie chart is calculated by multiplying 360° by the relative frequency of each class and rounding to the nearest whole degree.
Bar graphs - Used to represent categorical data. The height of the bar represents the amount of data in that category. The horizontal axis contains the qualitative categories, and the vertical axis represents the frequency of each category. Because the bars represent categories, the width of each bar is meaningless, and the bars usually do not touch. However, to avoid misrepresenting the data, the bars should be of uniform width.
Pareto chart - A special type of bar graph with the bars in descending order; typically used with nominal data.
Side-by-side bar graph - A bar graph that compares different groups.
Stacked bar graph - A bar graph that allows the reader to view different groups in a single category in order to make comparisons between the categories.

Histograms, Polygons, Stem and Leaf Plots:
Line graph - Used to show specific trends in data, normally over time, that show how two variables are related to one another. To construct a line graph, the x-axis will represent the independent variable in the data given and the y-axis will represent the dependent variable.
A point will mark where each x-value is associated with its corresponding y-value. A line will then be used to join the data points in order.
Frequency histogram - A bar graph of a frequency distribution. To construct a histogram:
1. Find the class boundaries of the frequency distribution.
2. Mark the class boundaries of every class on the horizontal axis, which is a real number line.
3. The width of the bars represents the width of each class.
4. The bars should touch, since the upper class boundary of one class is the same as the lower class boundary of the next class.
5. The bars should be uniform in width; thus, histograms are only appropriate for frequency distributions that have classes of uniform width.
6. The height of each bar represents the frequency of the class; thus, frequency is graphed on the vertical axis.
Relative frequency histogram - Identical to a regular histogram, except that the heights of the bars represent the relative frequencies of each class rather than simply the frequencies. It is appropriate to label a relative frequency histogram with either decimals or percentages.
Frequency polygon - A visual display of the frequencies of each class using the midpoints from the histogram. Steps for Constructing a Frequency Polygon:
1. Mark the class boundaries on the x-axis and the frequencies on the y-axis.
2. Add the midpoints to the x-axis and plot a point at the frequency of each class directly above its midpoint.
3. Join each point to the next with a line segment.
Ogive - A type of line graph which depicts the cumulative frequency of each class from a frequency table. Unlike creating a frequency polygon, we only include an extra class at the lower end for this graph, giving it a frequency of 0. Next, plot a point at the cumulative frequency for each class directly above its upper class boundary. The ogive is created by joining the points together with line segments.
Stem-and-leaf plot - The leaves are usually the last digit in each data value and the stems are the remaining digits.
Dot plot - A graph which retains the original data by plotting a dot above each data value on a number line.

Measures of Center:
Deviation - Given some point A and a data point x, x − A represents how far x deviates from A.
Weighted mean - For a sample, first multiply each value by its respective weight. Then divide the sum of these products by the sum of the weights to obtain the mean.
Unimodal - If only one value occurs the most, then the data set is unimodal.
Bimodal - If exactly two values occur equally often and more than all the others, then the data set is bimodal.
Multimodal - If more than two values occur equally often and more than all the others, then the data set is multimodal.
Determining the Most Appropriate Measure of Center:
1. For qualitative data, the mode should be used.
2. For quantitative data, the mean should be used, unless the data set contains outliers or is skewed.
3. For quantitative data sets that are skewed or contain outliers, the median should be used.

Measures of Dispersion:
Range - The difference between the largest and the smallest data value...
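The weighted mean, the unimodal/bimodal/multimodal distinction, and the range defined above can be sketched in Python as follows. The function names and sample values are hypothetical. One added assumption that the cheat sheet does not state: when every value occurs exactly once, the sketch reports "no mode", which is a common textbook convention.

```python
from collections import Counter

def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def modes(data):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

def modality(data):
    """Classify a data set by how many values tie for most frequent."""
    counts = Counter(data)
    # Assumption (not from the cheat sheet): no repeated value means no mode.
    if len(counts) > 1 and max(counts.values()) == 1:
        return "no mode"
    number_of_modes = len(modes(data))
    if number_of_modes == 1:
        return "unimodal"
    if number_of_modes == 2:
        return "bimodal"
    return "multimodal"

def data_range(data):
    """Largest data value minus smallest data value."""
    return max(data) - min(data)

# Made-up values for illustration: three scores weighted 2, 1, and 1.
print(weighted_mean([90, 80, 70], [2, 1, 1]))  # 82.5
print(modality([2, 3, 3, 5, 5, 7]))            # bimodal
print(data_range([4, 9, 1, 7]))                # 8
```

Because `modes` only counts occurrences, it also works on qualitative data such as a list of eye colors, where the mode is the recommended measure of center.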

