statistical analysis with software application PDF

Title statistical analysis with software application
Author Sheila Monique Tubilan
Course Statistical Analysis
Institution University of San Carlos
Pages 126


MODULE 1: INTRODUCTION TO THE STATISTICAL CONCEPTS

Statistics plays a major role in many aspects of our lives. It is used in sports, for example, to help a general manager decide which player might be the best fit for a team. It is used in politics to help candidates understand how the public feels about various policies. And statistics is used in medicine to help determine the effectiveness of new drugs. Used appropriately, statistics can enhance our understanding of the world around us. Used inappropriately, it can lend support to inaccurate beliefs. Understanding statistical methods will provide you with the ability to analyze and critique studies and the opportunity to become an informed consumer of information. Understanding statistical methods will also enable you to distinguish solid analysis from bogus “facts.”

Objectives: After successful completion of this module, you should be able to:

• Define statistics.
• Enumerate the importance and limitations of statistics.
• Explain the process of statistics.
• Know the difference between descriptive and inferential statistics.
• Distinguish between qualitative and quantitative variables.
• Distinguish between discrete and continuous variables.
• Determine the level of measurement of a variable.

DEFINITION OF STATISTICS

Many people say that statistics is numbers. After all, we are bombarded by numbers that supposedly represent how we feel and who we are. Certainly, statistics has a lot to do with numbers, but this definition is only partially correct. Statistics is also about where the numbers come from (that is, how they were obtained) and how closely the numbers reflect reality. Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions. Let’s break this definition into four parts. The first part states that statistics involves the collection of information. The second refers to the organization and summarization of information. The third states that the information is analyzed to draw conclusions or answer specific questions. The fourth part states that results should be reported using some measure that represents how convinced we are that our conclusions reflect reality.
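The four parts of the definition can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method. The counts (39 of 50 respondents answering "return") are borrowed from the PHP100 survey example later in this module. The function name proportion_ci is hypothetical, and the interval uses the normal (Wald) approximation with z = 1.96, which is only one of several ways to attach a 95% confidence level to a proportion.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Return (p_hat, lower, upper) for a proportion.

    Uses the normal (Wald) approximation; z = 1.96 corresponds to
    roughly 95% confidence. This choice is an assumption of the sketch.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Part 1: collect the information (counts taken from the PHP100 example).
responses = ["return"] * 39 + ["keep"] * 11

# Part 2: organize and summarize.
n = len(responses)
returned = responses.count("return")

# Part 3: analyze to draw a conclusion about the population.
p_hat, lower, upper = proportion_ci(returned, n)

# Part 4: report the result together with a measure of confidence.
print(f"Sample proportion: {p_hat:.0%}")
print(f"Approximate 95% interval: {lower:.1%} to {upper:.1%}")
```

Under these assumptions the sketch reports a sample proportion of 78% and an interval of roughly 66.5% to 89.5%. With only 50 respondents the interval is fairly wide. A margin of about four percentage points would need on the order of 400 respondents under the same approximation, so a range as narrow as the 74% to 82% quoted in that example is best read as illustrative.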

• Statistics is important because it enables people to make decisions based on empirical evidence.
• Statistics provides us with tools needed to convert massive data into pertinent information that can be used in decision making.
• Statistics can provide us information that we can use to make sensible decisions.

What information is referred to in the definition? The information referred to in the definition is the data. According to the Merriam-Webster dictionary, data are “factual information used as a basis for reasoning, discussion, or calculation”. Data can be numerical, as in height, or nonnumerical, as in gender. In either case, data describe characteristics of an individual.

Fields of Statistics

A. Mathematical Statistics - The study and development of statistical theory and methods in the abstract.
B. Applied Statistics - The application of statistical methods to solve real problems involving randomly generated data and the development of new statistical methodology motivated by real problems. Example branches of Applied Statistics: psychometrics, econometrics, and biostatistics.

Limitations of Statistics

1. Statistics is not suitable for the study of qualitative phenomena.
2. Statistics does not study individuals.
3. Statistical laws are not exact.
4. Statistical tables may be misused.
5. Statistics is only one of the methods of studying a problem.

Definitions:

• Universe is the set of all entities under study.

• A population is the total or entire group of individuals or observations from which information is desired by a researcher. Apart from persons, a population may consist of mosquitoes, villages, institutions, etc.

• An individual is a person or object that is a member of the population being studied.

• A statistic is a numerical summary of a sample.

• A sample is a subset of the population.

• Descriptive statistics consist of organizing and summarizing data. Descriptive statistics describe data through numerical summaries, tables, and graphs.

• Inferential statistics uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result.

• A parameter is a numerical summary of a population.

Example: Consider the following scenario. You are walking down the street and notice that a person walking in front of you drops PHP100. Nobody seems to notice the PHP100 except you. Since you could keep the money without anyone knowing, would you keep the money or return it to the owner?

Suppose you wanted to use this scenario as a gauge of the morality of students at your school by determining the percent of students who would return the money. How might you do this? You could attempt to present the scenario to every student at the school, but this would be difficult or impossible if the student body is large. A second possibility is to present the scenario to 50 students and use the results to make a statement about all the students at the school.

In the PHP100 study presented, the population is all the students at the school. Each student is an individual. The sample is the 50 students selected to participate in the study.

Suppose 39 of the 50 students stated that they would return the money to the owner. We could present this result by saying that the percent of students in the survey who would return the money to the owner is 78%. This is an example of a descriptive statistic because it describes the results of the sample without making any general conclusions about the population. So 78% is a statistic because it is a numerical summary based on a sample. Descriptive statistics make it easier to get an overview of what the data are telling us.

If we extend the results of our sample to the population, we are performing inferential statistics. The generalization contains uncertainty because a sample cannot tell us everything about a population. Therefore, inferential statistics includes a level of confidence in the results. So rather than saying that 78% of all students would return the money, we might say that we are 95% confident that between 74% and 82% of all students would return the money. Notice how this inferential statement includes a level of confidence (measure of reliability) in our results. It also includes a range of values to account for the variability in our results. One goal of inferential statistics is to use statistics to estimate parameters.

PROCESS OF STATISTICS

1. Identify the research objective. A researcher must determine the question(s) he or she wants answered. The question(s) must clearly identify the population that is to be studied.

2. Collect the information needed to answer the questions. Conducting research on an entire population is often difficult and expensive, so we typically look at a sample. This step is vital to the statistical process, because if the data are not collected correctly, the conclusions drawn are meaningless. Do not overlook the importance of appropriate data collection.

Example: A research objective is presented. For each research objective, identify the population and sample in the study.

1. The Philippine Mental Health Association contacts 1,028 teenagers who are 13 to 17 years of age and live in Antipolo City and asks whether or not they have been prescribed medications for any mental disorders, such as depression or anxiety.
Population: Teenagers 13 to 17 years of age who live in Antipolo City
Sample: 1,028 teenagers 13 to 17 years of age who live in Antipolo City

2. A farmer wanted to learn about the weight of his soybean crop. He randomly sampled 100 plants and weighed the soybeans on each plant.
Population: Entire soybean crop
Sample: The 100 selected soybean plants

3. Organize and summarize the information. Descriptive statistics allow the researcher to obtain an overview of the data and can help determine the type of statistical methods the researcher should use.

4. Draw conclusions from the information. In this step the information collected from the sample is generalized to the population. Inferential statistics uses methods that take results obtained from a sample, extend them to the population, and measure the reliability of the result.

Take Note! If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.

Example: For each of the following statements, decide whether it belongs to the field of descriptive statistics or inferential statistics.

1. A badminton player wants to know his average score for the past 10 games. (Descriptive Statistics)
2. A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries. (Inferential Statistics)
3. Janine wants to determine the variability of her six exam scores in Algebra. (Descriptive Statistics)
4. A shipping company wishes to estimate the number of passengers traveling via their ships next year using their data on the number of passengers in the past three years. (Inferential Statistics)
5. A politician wants to determine the total number of votes his rival obtained in the past election based on his copies of the tally sheet of electoral returns. (Descriptive Statistics)

DISTINCTION BETWEEN QUALITATIVE AND QUANTITATIVE VARIABLES

Variables are the characteristics of the individuals within the population. For example, recently my mother and I planted a tomato plant in our backyard. We collected information about the tomatoes harvested from the plant. The individuals we studied were the tomatoes. The variable that interested us was the weight of a tomato. My mom noted that the tomatoes had different weights even though they came from the same plant. She discovered that variables such as weight may vary.

If variables did not vary, they would be constants, and statistical inference would not be necessary. Think about it this way: If each tomato had the same weight, then knowing the weight of one tomato would allow us to determine the weights of all tomatoes. However, the weights of the tomatoes vary. One goal of research is to learn the causes of the variability so that we can learn to grow plants that yield the best tomatoes.
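The idea that a variable varies can be made concrete with a short Python sketch. The tomato weights below are made-up values in grams (the module reports no measurements), and the standard library's statistics module is used to summarize them.

```python
import statistics

# Hypothetical tomato weights in grams; illustrative values only.
weights = [142.0, 155.5, 138.2, 160.1, 149.7]

mean_weight = statistics.mean(weights)
spread = statistics.stdev(weights)       # sample standard deviation
is_constant = len(set(weights)) == 1     # True only if every tomato weighs the same

print(f"Mean weight: {mean_weight:.1f} g")
print(f"Standard deviation: {spread:.1f} g")
print(f"Weight is a constant: {is_constant}")

# If every tomato had the same weight, the spread would be zero and
# one tomato would tell us everything about the rest.
identical = [150.0] * 5
print(f"Spread of identical weights: {statistics.stdev(identical):.1f} g")
```

A nonzero standard deviation is what makes weight a variable rather than a constant in this sketch.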

It is helpful to divide variables into different types, as different statistical methods are applicable to each. The main division is into qualitative (or categorical) and quantitative (or numerical) variables. Variables can be classified into two groups:

1. A qualitative (categorical) variable yields categorical responses. Its value is a word or a code that represents a class or category.
2. A quantitative (numeric) variable takes on numerical values representing an amount or quantity.

Example: Determine whether the following variables are qualitative or quantitative.

1. Hair color (Qualitative)
2. Temperature (Quantitative)
3. Stages of breast cancer (Qualitative)
4. Number of hamburgers sold (Quantitative)
5. Number of children (Quantitative)
6. Zip code (Qualitative)
7. Place of birth (Qualitative)
8. Degree of pain (Qualitative)

DISTINCTION BETWEEN DISCRETE AND CONTINUOUS

Quantitative variables may be further classified into:

1. A discrete variable is a quantitative variable that has either a finite number of possible values or a countable number of possible values. If you count to get the value of a quantitative variable, it is discrete.
2. A continuous variable is a quantitative variable that has an infinite number of possible values that are not countable. If you measure to get the value of a quantitative variable, it is continuous.

Example: Determine whether the following quantitative variables are discrete or continuous.

1. The number of heads obtained after flipping a coin five times. (Discrete)
2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M. (Discrete)
3. The distance a 2005 Toyota Prius can travel in city conditions with a full tank of gas. (Continuous)
4. Number of words correctly spelled. (Discrete)
5. Time of a runner to finish one lap. (Continuous)
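The two classification rules above (qualitative versus quantitative, then count versus measure) can be written down as a small Python sketch. The Variable class and its field names are hypothetical, introduced only for illustration. The example entries are taken from the lists above, with "number of children" marked as obtained by counting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variable:
    name: str
    kind: str                           # "qualitative" or "quantitative"
    obtained_by: Optional[str] = None   # "count" or "measure" (quantitative only)

    def subtype(self) -> Optional[str]:
        """Return 'discrete' or 'continuous' for quantitative variables, else None."""
        if self.kind != "quantitative":
            return None
        return "discrete" if self.obtained_by == "count" else "continuous"


examples = [
    Variable("hair color", "qualitative"),
    # A zip code is written with digits but only labels a place, so it is qualitative.
    Variable("zip code", "qualitative"),
    Variable("number of children", "quantitative", "count"),
    Variable("time to finish one lap", "quantitative", "measure"),
]

for v in examples:
    label = v.kind if v.subtype() is None else f"{v.kind}, {v.subtype()}"
    print(f"{v.name}: {label}")
```

The sketch treats the count-or-measure question as something the analyst supplies. It does not try to infer the type from the data itself, because a value stored as digits, such as a zip code, can still be qualitative.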

LEVELS OF MEASUREMENT

It is important to know which type of scale is represented by your data since different statistics are appropriate for different scales of measurement. A characteristic may be measured using nominal, ordinal, interval, and ratio scales.

1. Nominal Level - Nominal scales are sometimes called categorical scales or categorical data. Such a scale classifies persons or objects into two or more categories. Whatever the basis for classification, a person can only be in one category, and members of a given category have a common set of characteristics.

Example:

- Method of payment (cash, check, debit card, credit card)
- Type of school (public vs. private)
- Eye Color (Blue, Green, Brown)

2. Ordinal Level - This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. An ordinal scale not only classifies subjects but also ranks them in terms of the degree to which they possess a characteristic of interest. In other words, an ordinal scale puts the subjects in order from highest to lowest, from most to least. Although ordinal scales indicate that some subjects are higher or lower than others, they do not indicate how much higher or how much better.

Example:

- Food Preferences
- Stage of Disease
- Social Economic Class (First, Middle, Lower)
- Severity of Pain

3. Interval Level - This measurement level not only classifies and orders the measurements, but it also specifies that the differences between values of the variable are meaningful. A value of zero does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed on values of the variable.

Example:

- Temperature on a Fahrenheit/Celsius thermometer
- Trait anxiety (e.g., high anxious vs. low anxious)
- IQ (e.g., high IQ vs. average IQ vs. low IQ)

4. Ratio Level - A ratio scale represents the highest, most precise level of measurement. It has the properties of the interval level of measurement, and the ratios of the values of the variable have meaning. A value of zero means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on the values of the variable.

Example:

- Height and weight
- Time
- Time until death

Operations that make sense for variables of different scales.
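The operations that make sense at each level can be summarized in a short Python sketch. The mapping below is the conventional cumulative summary implied by the four definitions (each level keeps the operations of the levels before it). It is an assumption of this sketch, not a reproduction of the module's own table, and the names LEVELS, NEW_OPERATIONS, and allowed_operations are hypothetical.

```python
# Levels ordered from least to most informative.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The operation each level adds on top of the previous ones (assumed summary).
NEW_OPERATIONS = {
    "nominal": "classify / count",
    "ordinal": "rank / order",
    "interval": "add / subtract",
    "ratio": "multiply / divide",
}


def allowed_operations(level):
    """Return every operation meaningful at the given level of measurement."""
    idx = LEVELS.index(level)
    return [NEW_OPERATIONS[name] for name in LEVELS[: idx + 1]]


for level in LEVELS:
    print(f"{level}: {', '.join(allowed_operations(level))}")

# Temperature in Celsius is interval, not ratio: differences are meaningful,
# but 20 degrees is not "twice as hot" as 10 degrees because 0 is arbitrary.
c_low, c_high = 10.0, 20.0
difference = c_high - c_low                           # meaningful: 10 degrees warmer
naive_ratio = c_high / c_low                          # 2.0, but not meaningful
kelvin_ratio = (c_high + 273.15) / (c_low + 273.15)   # ratio on a scale with a true zero

print(f"Difference: {difference} degrees")
print(f"Naive Celsius ratio: {naive_ratio}, Kelvin ratio: {kelvin_ratio:.3f}")
```

On the Kelvin scale, where zero does mean the absence of the quantity, the same two temperatures differ by a factor of only about 1.035, which is why ratios are reserved for the ratio level.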

Both interval and ratio data involve measurement. Most data analysis techniques that apply to ratio data also apply to interval data. Therefore, in most practical aspects, these types of data (interval and ratio) are grouped under metric data. In some other instances, these types of data are also known as numerical discrete and numerical continuous.

Example: Categorize each of the following as nominal, ordinal, interval, or ratio measurement.

1. Ranking of college athletic teams. (Ordinal)
2. Employee number. (Nominal)
3. Number of vehicles registered. (Ratio)
4. Brands of soft drinks. (Nominal)
5. Number of car passers along C5 on a given day. (Ratio)
6. Zip code (Nominal)
7. Degree of pain (Ordinal)

ACTIVITIES/ASSESSMENTS:

Read each item carefully. Write the answer on the yellow paper. Answers Only.

I.

A research objective is presented. For each, identify the (A) population and (B) sample in the study.

1. A polling organization contacts 2141 male university graduates who have a white-collar job and asks whether or not they had received a raise at work during the past 4 months.
A. ______________________________
B. ______________________________

2. Every year the PSA releases the Current Population Report based on a survey of 50,000 households. The goal of this report is to learn the demographic characteristics, such as income, of all households within the Philippines.
A. ______________________________
B. ______________________________

3. Researchers want to determine whether or not higher folate intake is associated with a lower risk of hypertension (high blood pressure) in women (27 to 44 years of age). To make this determination, they look at 7373 cases of hypertension in these women and find that those who consume at least 1000 micrograms per day of total folate had a decreased risk of hypertension compared with those who consume less than 200.
A. ______________________________
B. ______________________________

II. Indicate whether the following statements require the use of descriptive or inferential statistics.

______________1. A teacher wants to know the attitudes of all students towards abortion.
______________2. A market analyst of a sales firm draws a chart showing the sales figures of a given product for the period 2006-2007.
______________3. A forecaster predicts the results of an election using the number of votes cast in 15 out of 25 barangays.
______________4. Men are better in math than women.
______________5. Forty percent of the employees of an organization were recorded tardy for at least 15 working days.
______________6. There are very few gender-related occupations.
______________7. An accountant predicts the accuracy rate of a client’s financial resources.
______________8. A quality control manager wishes to check production output.
______________9. Records indicated that 75% of the faculty in the graduate school are doctoral degree holders.
______________10. There is no relationship between educational qualification of parents and academic achievement of their children.

III. Identify the qualitative and quantitative variables and indicate the highest level of measurement required in each. If quantitative, classify whether discrete or continuous.

______________1. Occupation
______________2. Number of government officials
______________3. Favorite color
______________4. Temperature in Celsius degrees
______________5. Type of school
______________6. Volume of mineral water sold daily
______________7. Employee number
______________8. Civil status
______________9. Equity accounts
______________10. Brands of soft drinks
______________11. Socioeconomic status
______________12. Employment status
______________13. Number of missing teeth
______________14. Number of vehicles registered
______________15. Jersey number
______________16. Number of employees collecting retirement benefits from GSIS
______________17. Duration of a seizure
______________18. Cause of death
______________19. Dividends
______________20. Current assets list
______________21. Number of heart attacks
______________22. Accounts receivable
______________23. Clothing size
______________24. Blood type
______________25. Ethnic group

REFERENCES:

Statistics: Informed Decisions Using Data by Michael Sullivan III. Fifth Edition

Sampling: Design and Analysis by Sharon L. Lohr. Second Edition

MODULE 2: DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN

Objectives: After successful completion of this module, you should be able to:

• Determine the sources of data (primary and secondary data).
• Distinguish the different methods of data collection under primary and secondary data.
• Determine the appropriate sample size.
• Differentiate various sampling techniques.
• Know the sources of errors in sampling.

DATA COLLECTION

It is common for people to receive large quantities of information every day through conversations, televisions, computers, t...

