Research Paper PDF

Title Research Paper
Course Statistics for Business
Institution National University (US)
Pages 20
File Size 435.6 KB
File Type PDF

Summary

Paper on Statistics Course...


Description

Statistics for Business

Introduction

This research paper is about our class MNS601, Statistics for Business. Over the four weeks of class, we covered multiple topics in statistics. We learned how to carry out statistical calculations and how to explain the results in words. The course covered topics from the book Statistics for Business and Economics; for further understanding, I have also used our professor's lecture notes and various websites to conduct this research. The research covers the topics, calculations, and definitions learned in the class during these four weeks. A total of 10 class learning objectives (CLO) were assigned to this course, and in this research I have attempted to cover all the CLOs and their respective topics. The research explains the theories according to the professor, the book, and online sources. I have also provided tables and calculations where appropriate to illustrate the concepts and their application. The research is organized by week and is divided into 4 parts, one for each of the 4 weeks. Each week consisted of at least 2 CLOs and various subtopics.


Week 1

There were three Class Learning Objectives (CLO) for week 1:

CLO 1 – Review necessary math and Excel operations. Develop an understanding of data and statistics. Apply the concepts of data, variables, statistics and parameters.
CLO 2 – Communicate statistical data in appropriate forms. Examine the role and uses of tables and graphs.
CLO 3 – Apply the fundamentals of descriptive statistics, including mean, standard deviation and sample size. Introduction to probability in statistics; most common probability distributions used in business decisions.

To cover these three learning objectives, we learned new terms and basic statistical distributions. Starting with data, our professor explained it as a set of values or observations that has no meaning by itself. According to the Statistics for Business and Economics book, data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. For example, take the number of students registered in each of a random sample of 5 classes at National University. Each class will have a different number of registered students, and those numbers are the data.

The next thing we learned about was the variable. According to our professor, a variable is something that is not static; it changes. According to the book, a variable is a characteristic of interest for the elements. The world is rich in data but poor in information. Then we learned about the parameter, which is a numerical value that describes a characteristic of a population, just as a statistic describes a characteristic of a sample.

The next topic we covered was descriptive statistics and inferential statistics. Descriptive statistics is the science of describing the important aspects of a set of measurements.

Inferential statistics, on the other hand, uses a random sample of data taken from a population to describe and make inferences about the population. Inferential statistics is valuable when it is not convenient or possible to examine each member of an entire population.

Within descriptive statistics, we learned about the mean, median, and mode. The mean is the average of the data. The median is the middle number when the data are arranged in ascending order. The mode is the value with the highest frequency, that is, the value that is repeated the most in a given set of data.

Standard deviation describes how spread out the data are. It represents deviations from the mean and is denoted by $\sigma$. The variance is a measure of variability that uses all the data. It is based on the difference between the value of each observation and the mean, and is denoted by $\sigma^2$. The differences are squared so that positive and negative deviations do not cancel each other out, and the standard deviation is the square root of the variance. Based on what we have learned so far, the following table shows the different terms and their calculations.

Data (X)    Mean    Difference (X - mean)    Difference Squared
7           10      -3                       9
12          10      2                        4
9           10      -1                       1
7           10      -3                       9
15          10      5                        25
Total                                        48

Variance: $\sigma^2 = \sum (X - \bar{X})^2 / n = 48 / 5 = 9.6$
Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{9.6} \approx 3.10$
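The table's calculation can be checked with a short script. This is only a sketch: it uses the five class sizes from the table and divides by n, as the table does, rather than by n - 1 as a sample variance would.

```python
import math

# Class sizes from the table above.
data = [7, 12, 9, 7, 15]

n = len(data)
mean = sum(data) / n

# Difference of each observation from the mean, then squared.
differences = [x - mean for x in data]
squared = [d ** 2 for d in differences]

# Dividing by n follows the table; a sample variance would divide by n - 1.
variance = sum(squared) / n
std_dev = math.sqrt(variance)

print(mean, sum(squared), variance, round(std_dev, 2))
```

The values printed correspond to the mean, the total of the squared differences, the variance, and the standard deviation in the table.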

Then we covered levels of measurement, which classify data by the kind of information the values carry. The levels are Nominal (labels with no sequence), Ordinal (categories that can be put in order), Interval (numeric values with equal spacing but no true zero), and Ratio (numeric values with a true zero, such as weight).

Then we covered measures of central tendency, which belong to descriptive statistics as discussed before. In statistics, a measure of central tendency is a central or typical value for a distribution. It is usually measured by the mean, median, and mode of the distribution.

Then we covered frequency distribution. It is a summary of the data and another way of presenting the data to make it more meaningful. For example, we can use a table to represent a set of data like the one above. We can also represent data in the form of a line graph, pie chart, bar chart, or histogram. A bar chart is used for categorical or discrete data, whereas a histogram is used for continuous data grouped into intervals.

For inferential statistics, we covered probability, which expresses the likelihood of what is going to happen. The probabilities of all possible outcomes must always add up to 1. Within the probability concept, we learned about binomial probability, which has certain criteria: the number of trials must be fixed in advance; each trial must have only two outcomes, like a coin toss with only heads or tails; the trials must be independent, that is, one trial must not influence the others; and the probability of success must be the same on every trial.

Week 2

There were two Class Learning Objectives (CLO) for week 2:

CLO 4 – Analyze data to determine the role of probability and various probability distributions. Identify and study discrete probability distributions. Examine and use continuous probability distributions.
CLO 5 – Evaluate sampling distributions and their role in confidence intervals. Develop the Concepts of Sampling and Sampling Distributions. Understand the Ideas of Interval Estimation. Define the Fundamentals of Statistics in Quality Control.
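As one concrete instance of the discrete probability distributions named in CLO 4, the binomial criteria listed at the end of Week 1 can be sketched in code. The paper does not state the binomial formula; the sketch assumes the standard one, P(k) = C(n, k) p^k (1 - p)^(n - k), and the five coin tosses are a hypothetical example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical example: 5 tosses of a fair coin, counting heads.
n_trials = 5
p_heads = 0.5
probs = [binomial_pmf(k, n_trials, p_heads) for k in range(n_trials + 1)]

for k, prob in enumerate(probs):
    print(k, prob)

# The probabilities of all possible outcomes add up to 1.
print(sum(probs))
```

The list `probs` holds one probability for each possible number of heads, from 0 to 5.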


In the second week we started with the Central Limit Theorem. The central limit theorem states that, given certain conditions, the arithmetic mean of a sufficiently large number of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution. A simple example of the central limit theorem is rolling a large number of identical, unbiased dice. The distribution of the sum (or average) of the rolled numbers will be well approximated by a normal distribution.

We then learned about discrete probability, which is the probability that a random variable, X, takes each of its possible values. The probability function for rolling a fair die is $f(x) = 1/n$, where $n = 6$ is the number of possible outcomes. Given the probability distribution, the variance of X is calculated by $\sigma^2 = \sum (x - \mu)^2 f(x)$, where $\mu$ is the expected value of X. When the data are continuous, probabilities are described by a probability density function instead. For a uniform distribution, the density is $f(x) = 1/(b - a)$, where $a \le x \le b$ is the interval. The normal distribution is a bell-shaped curve that is symmetrical about the mean, and it is the most common curve used to represent a data distribution.

We then learned about the confidence interval. It is a type of interval estimate of a population parameter. To calculate the interval estimate, we first subtract the rate of error, $\alpha$, from 1 to get the confidence level, and then use the following formula:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the Z-score that leaves an area of $\alpha/2$ in the upper tail of the standard normal distribution. Given the confidence level, we can use the normal distribution to calculate the interval estimate. The confidence level indicates how confident we are in our claim.

The next thing we learned about was sampling. Starting with the basics, we learned the key notation for values such as the standard deviation and the sample size. The standard deviation of the sample mean is denoted $\sigma_{\bar{x}}$, whereas the population standard deviation is denoted simply by $\sigma$. The sample size is denoted by n, and the population size is denoted by N. To calculate the standard deviation of the sample mean, we divide the population standard deviation by the square root of the sample size, $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. For the example shown above in the table ($n = 5$), if we assume that the standard deviation of the population is 5, then $\sigma_{\bar{x}} = 5 / \sqrt{5} \approx 2.24$.

The next thing we covered in the class was quality control. First, we learned about ISO certification. ISO is the short name of the International Organization for Standardization. ISO International Standards ensure that products and services are safe, reliable and of good quality. For business, they are strategic tools that reduce costs by minimizing waste and errors and increasing productivity. They help companies to access new markets, level the playing field for developing countries and facilitate free and fair global trade. ISO 9000 deals directly with quality management. The standards provide guidance and tools for companies and organizations that want to ensure that their products and services consistently meet customers' requirements, and that quality is consistently improved. Standards in the ISO 9000 family include:

ISO 9001:2015 - sets out the requirements of a quality management system



ISO 9000:2015 - covers the basic concepts and language



ISO 9004:2009 - focuses on how to make a quality management system more efficient and effective



ISO 19011:2011 - sets out guidance on internal and external audits of quality management systems.

The next thing we learned about quality control was Six Sigma. Six Sigma is a disciplined, data-driven approach and methodology for eliminating defects (driving toward six standard deviations between the mean and the nearest specification limit) in any process, from manufacturing to transactional and from product to service. To achieve Six Sigma, a process must not produce more than 3.4 defects per million opportunities. A Six Sigma defect is defined as anything outside of customer specifications. A Six Sigma opportunity is then the total quantity of chances for a defect.

The third thing we covered from quality control was control charts. According to the book, a control chart provides a basis for deciding whether the variation in the output is due to common causes (in control) or assignable causes (out of control). Whenever an out-of-control situation is detected, adjustments or other corrective action will be taken to bring the process back into control. There are two types of errors: controllable and uncontrollable. Controllable errors can be controlled, whereas uncontrollable errors cannot be controlled but can be avoided by being proactive.

Another topic to look into was the Malcolm Baldrige National Quality Award. The award is given by the president of the United States to organizations that apply and are judged to be outstanding in seven areas: leadership; strategic planning; customer and market focus; measurement, analysis, and knowledge management; human resource focus; process management; and business results. It was established by the U.S. Congress in 1987 to raise awareness of quality management and recognize U.S. companies that have implemented successful quality management systems.

The last thing we covered in week 2 was permutation and combination. These count the number of experimental outcomes when the experiment involves selecting n objects from a set of N objects. In mathematics, a combination is a way of selecting items from a collection such that (unlike permutations) the order of selection does not matter. In a permutation, the order of selection matters. The formula used to calculate the number of combinations is $C^N_n = \frac{N!}{n!(N-n)!}$.

Week 3

There were two Class Learning Objectives (CLO) for week 3:

CLO 6 – Design single sample and two sample hypothesis tests. Perform Single Sample Hypothesis Testing. Perform Two Sample Hypothesis Testing. Learn How to Make Inferences About Variances.
CLO 7 – Study correlation and simple linear regression. Apply the techniques relating to correlation, simple linear regression and multiple regression. Develop and utilize multiple regression. Study model building with regression analysis.

Week 3 began with leadership. We looked into the factors that influence investment decisions, and a company being in a leadership position is one of them. According to the professor, it is better to invest in a mediocre product with excellent leadership than in a top product with mediocre leadership.

This week was also the start of the hypothesis concept. A hypothesis is a claim that something is true. The claim that something is true is termed the null hypothesis, $H_0$. Anything other than the claim is termed the alternate hypothesis, $H_a$. In hypothesis testing, we analyze a sample of data to determine whether a hypothesis should or should not be rejected. We look at a value of a population parameter, and the null hypothesis is the assumption about that parameter. For example, let us assume that the average number of students in each class at National University is at least 10. Therefore, the hypotheses will be written as:


$H_0: \mu \ge 10$
$H_a: \mu < 10$

The null hypothesis is an assumption that is always challenged. Based on the sample data, we have to determine whether the assumption in the null hypothesis should be rejected or not. There are two types of errors in hypothesis testing: Type I and Type II errors.

               H0 True               Ha True
Accept H0      Correct Conclusion    Type II error
Reject H0      Type I error          Correct Conclusion

According to the table above, a Type I error occurs when H0 is actually true but we have rejected the null hypothesis. On the other hand, a Type II error occurs when Ha is actually true but we have accepted the null hypothesis.

A hypothesis test can be a one-tail test or a two-tail test. In a one-tail test, the null hypothesis points in a direction, such as greater than or less than, and the alternate hypothesis points in the opposite direction. In a two-tail test, the null hypothesis takes a single value whereas the alternate hypothesis takes all other values in both directions. For example, using the previous example with a slight change in the null hypothesis, in place of at least 10 we now write that the average number of students in each class is 10. This hypothesis is written as follows:

$H_0: \mu = 10$
$H_a: \mu \ne 10$

Then we covered the standard error of the mean, $\sigma_{\bar{x}}$. This is the standard deviation of the sample mean and is calculated with the formula $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. The standard error is important because it allows us to calculate the test statistic, Z:

$$Z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$$

The test statistic, Z, helps us determine, via the area under the curve, whether the null hypothesis should or should not be rejected. Once we find the Z-value, we refer to the Z-distribution table to find the area under the curve that corresponds to it. Comparing that area with the level of significance, which is 1 minus the confidence level, gives us the test result.

The next topic we covered was inference about variance. The sample variance, $s^2$, is the point estimator of the population variance, $\sigma^2$. When a simple random sample of size n is selected from a normal population, the quantity $(n-1)s^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom. The chi-square distribution is also used in the statistical method for assessing the goodness of fit between observed values and those expected theoretically. The number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. More generally, the number of independent ways by which a dynamic system can move, without violating any constraint imposed on it, is called the number of degrees of freedom. To use the chi-square distribution, we need the degrees of freedom and the confidence level to determine the area under the curve.

The next thing we covered was the correlation coefficient. The correlation coefficient measures the strength and the direction of a linear relationship between two variables. To calculate the correlation coefficient, we need the sample covariance, $s_{xy}$, the sample standard deviation of variable X, $s_x$, and the sample standard deviation of variable Y, $s_y$. The formula to calculate the correlation coefficient, $r_{xy}$, is as follows:

$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$


The sample covariance, $s_{xy}$, is an average of the products of the deviations of the X and Y data from their means. Thus, the physical unit of the sample covariance is the product of the units of X and Y. It is calculated using the following formula:

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

The sample standard deviations for variables X and Y are calculated as follows:

$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}, \qquad s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$$
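The covariance and standard deviation calculations feed directly into the correlation coefficient, and a small script can make the steps concrete. This is a sketch only: it assumes the usual sample formulas with n - 1 in the denominator, and the x and y values are hypothetical, not taken from the class.

```python
import math


def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """Sample standard deviation, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(xs, ys):
    """Sample covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_xy = sample_covariance(x, y)
r_xy = correlation(x, y)
print(s_xy, round(r_xy, 4))
```

For this data the covariance is positive, so the correlation coefficient is positive as well.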

The correlation coefficient ranges from -1 to +1. Values close to -1 or +1 indicate a strong linear relationship. The closer the correlation is to zero, the weaker the relationship. However, a high correlation between two variables does not mean that changes in one variable will cause changes in the other variable. A correlation higher than zero indicates a positive correlation, which suggests that the two variables tend to move in the same direction. A correlation less than zero indicates a negative correlation, which suggests that the two variables tend to move in opposite directions.

We then moved on to regression analysis. Regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. Regression analysis helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held fixed.

The first type of regression we covered is simple linear regression. Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables. One variable, denoted x, is regarded as the predictor, explanatory, or independent variable, and the other variable, denoted y, is regarded as the response, outcome, or dependent variable. The equation for simple linear regression is $\hat{y} = b_0 + b_1 x$. The coefficients are estimated with the least squares method. To calculate $b_1$ we use the following formula:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

After the calculation of $b_1$, we can calculate $b_0$ by the formula $b_0 = \bar{y} - b_1 \bar{x}$. Based on the given data and our calculated values, we can make predictions about the two variables.

The next method we learned is multiple linear regression. It tells us how a dependent variable is related to 2 or more independent variables. The formula for the multiple linear regression model is $\hat{y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$. Th...
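For the simple linear regression described above, the least squares estimates can be sketched in a few lines. The sketch assumes the standard least squares formulas, with the slope b1 equal to the sum of products of deviations divided by the sum of squared x deviations and the intercept b0 equal to the mean of y minus b1 times the mean of x. The data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the line y_hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of products of deviations over sum of squared x deviations.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    b1 = numerator / denominator
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
prediction = b0 + b1 * 6  # predicted y for a new value x = 6
print(round(b0, 2), round(b1, 2), round(prediction, 2))
```

Once b0 and b1 are known, a prediction for any new x is a single multiplication and addition, as in the last lines of the sketch.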

