Business Statistics FULL NOTES PDF

Title Business Statistics FULL NOTES
Author Sarah Thorne
Course Business Statistics
Institution University of Technology Sydney
Pages 56
File Size 5.5 MB
File Type PDF



Description

Business Statistics

LECTURE 1

1. Describing Data
Data Types. We describe data along three dimensions: 1. Central Tendency 2. Variability 3. Shape

Qualitative/Categorical: Categories with mutually exclusive labels. If labeled with numbers, the numbers have no mathematical meaning. e.g. Apple, Samsung, Sony
Nominal: (Sony, Apple, Samsung: 1 – Sony, 2 – Apple, 3 – Samsung)
- Ordering or ranking makes no sense
- Numerical labels are arbitrary
Ordinal: (Dissatisfied, Neutral, Satisfied: -1 – Dissatisfied, 0 – Neutral, 1 – Satisfied)
- Ordering or ranking has meaning or can be interpreted
- Numerical labels respect the ordering
Quantitative/Numerical: Numbers are used to record certain events. The numbers have mathematical meaning.
Interval: (Temperature)
- Differences in quantity are meaningful, but ratios are not
  o 30°C is 15°C higher than 15°C, but not two times warmer than 15°C
- Zero has no natural meaning and is hard to interpret
  o 0°C does not mean there is no heat
Ratio: (Income in dollars, Height, Waiting Time in minutes)
- Besides differences, the ratio of two quantities is also meaningful
  o Lucy earns twice as much as her husband
- Zero is meaningful
  o The waiting time in the cinema is zero: you do not need to wait.

2. Working with categorical data
For categorical data, it is intuitive to tabulate (put into a table) and visualise the data.
- The technique is to use a frequency distribution
- Frequency counts: the total number of occurrences for each category
- Relative frequency: the fraction or proportion of the total number of data items belonging to the category
- Percent frequency: relative frequency × 100%
e.g. Marital Status of Home Loan Applicants
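The three frequency measures can be sketched in Python. The marital-status values below are made up for illustration; they are not the lecture's actual applicant table.

```python
from collections import Counter

# Hypothetical categorical sample (not the lecture's loan-applicant data)
data = ["Married", "Single", "Married", "Divorced", "Single", "Married"]

counts = Counter(data)                                   # frequency counts
n = len(data)
rel_freq = {k: v / n for k, v in counts.items()}         # relative frequency
pct_freq = {k: 100 * v / n for k, v in counts.items()}   # percent frequency = relative × 100%

assert counts["Married"] == 3
assert rel_freq["Married"] == 0.5
assert pct_freq["Married"] == 50.0
```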

3. Intermezzo: The language
Random variable (RV): A variable (e.g. people's heights, or the heights of females aged between 25 and 40) whose values are uncertain before you collect them.
Population: The complete pool of a certain random variable (e.g. all humans' heights; the heights of all females aged between 25 and 40 on Earth).
Sample: A random collection of a certain size from the population (e.g. the heights of 50 randomly chosen people; the heights of 50 randomly chosen females aged between 25 and 40).

Probability distribution: The general shape of probability for the values that a random variable may assume, e.g. the number of children a household has.
Random variables are usually denoted by capital letters X, Y, …
- X: the number of children in a household
- Y: the amount of time spent by a husband on housework per day
Realisations/observations of a RV are denoted by lowercase letters with subscripts, x, y, …, with i ∈ {1, 2, …, N} or i ∈ {1, 2, …, n} (read: i ranges over the set from one to N, or from one to n)
- x1: the number of children in household 1
- y137: the amount of time spent by husband 137 on housework per day
N and n denote the size or number of observations. Typically:
- N refers to the population size
- n denotes the sample size, i.e. the number of data points we collect in a sample

4. Descriptive statistics: Central tendency
Definition: Measures of central tendency yield information about the centre of the distribution of a RV. They give us some idea of the typical, middle or average value that a RV can take. They are sometimes called measures of location.
Three measures of central tendency:
1. Mean: the arithmetic average value
2. Mode: the most commonly occurring value
3. Median: the middle value in an ordered array
Mean: the sample mean is the sum of the observations divided by the sample size, x̄ = (x1 + x2 + … + xn)/n
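A minimal sketch of the three measures using Python's standard library, on a made-up sample:

```python
import statistics

# Hypothetical sample (illustration only)
data = [2, 3, 3, 5, 7]

mean = statistics.mean(data)      # arithmetic average: (2+3+3+5+7)/5 = 4
mode = statistics.mode(data)      # most frequent value: 3
median = statistics.median(data)  # middle value of the sorted data: 3

assert mean == 4
assert mode == 3
assert median == 3
```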


5. Descriptive statistics: Variability
Definition: Measures of variability yield information about how far a realisation of the RV is likely to be from the centre of its distribution. They give us some idea of the fluctuation and volatility across realisations of the RV. They are sometimes called measures of scale, spread, dispersion or risk.
Three commonly used measures of variability:
1. Variance (Var): the average squared distance from the mean
2. Standard deviation (STD): the square root of the variance
3. Coefficient of variation: STD/mean × 100%
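A sketch of the three variability measures on made-up data. Note that `statistics.pvariance` and `statistics.pstdev` compute the population versions (average squared distance from the mean), matching the definitions above:

```python
import statistics

# Hypothetical data (illustration only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

var = statistics.pvariance(data)          # population variance: mean squared distance from the mean
std = statistics.pstdev(data)             # standard deviation: square root of the variance
cv = std / statistics.mean(data) * 100    # coefficient of variation, in %

assert var == 4
assert std == 2.0
assert cv == 40.0
```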

6. Descriptive statistics: Shape
Central tendency and variability are useful to describe and summarise data, or the distribution of RVs, but they cannot summarise asymmetry. Skewness is a measure of asymmetry.


LECTURE 2

1. Probability Theory
A prerequisite to statistics is probability theory. We need to know what an event means and how we assign probability.
Event: A set of outcomes (it can contain no outcome, a single outcome or multiple outcomes) of an experiment, to which a probability is assigned.
Sample space: The set of all possible outcomes. So an event is a subset of the sample space.
e.g. the RV is the phone plan chosen by a client during weekdays or weekends.

With observed outcomes, there are two methods of assigning probability.
Classical: Every outcome is assigned the same probability (zero knowledge leads to a naive guess): P(outcome 1) = … = P(outcome n) = 1/n


Relative frequency: Outcomes receive probability corresponding to their number of occurrences (observation updates the naive guess): P(outcome i) = (number of occurrences of outcome i) / (total number of occurrences of all outcomes)
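The two assignment methods can be sketched as follows. The phone-plan labels come from the lecture's example, but the observation list is invented for illustration:

```python
from collections import Counter
from fractions import Fraction

# Classical: with k equally likely outcomes, each gets probability 1/k
outcomes = ["$29", "$49", "$79"]
classical = {o: Fraction(1, len(outcomes)) for o in outcomes}
assert classical["$29"] == Fraction(1, 3)

# Relative frequency: probability proportional to observed occurrences
observed = ["$29", "$49", "$49", "$79", "$49"]   # hypothetical observations
counts = Counter(observed)
rel = {o: counts[o] / len(observed) for o in counts}
assert rel["$49"] == 0.6   # 3 occurrences out of 5
```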

2. Law of Addition

Mathematically, we can denote an event by A, and denote the complement of the event by A′ (pronounced "A prime"), which means "not A". Complement Rule of Probability: P(A′) = 1 − P(A)

When referring to joint probability, we use the intersection symbol ∩. The event A ∩ B (read: the intersection of A and B, or A intersection B) is the event where both A and B are true, or both A and B occur.
- Let A denote the event "$29", and B denote "weekdays". A′ then indicates "not $29", which in the previous example is "$49 or $79". A ∩ B then indicates "$29 and weekdays".

Law of Total Probability, Version 1: P(A) = P(A ∩ B) + P(A ∩ B′)

General rule of addition: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

If events A and B cannot occur at the same time (A occurs only if B does not occur), we say A and B are mutually exclusive (events).


Mutually exclusive events: P(A ∩ B) = 0

In a Venn diagram, these events do not intersect. Clearly, any event and its complement are mutually exclusive: either "A occurs" or "A does not occur". So P(A ∩ A′) = 0 ALWAYS.

If the occurrence of A and B together covers the whole sample space, we say A and B are collectively exhaustive. Collectively exhaustive events: P(A ∪ B) = 1

Clearly, any event and its complement are collectively exhaustive: "A occurs" and "A does not occur" make up all possible outcomes. So P(A ∪ A′) = 1 ALWAYS.

3. Conditional probability
In many cases, we are interested in conditional probability, e.g. what is the probability of achieving growth in the next quarter, conditional on the success of our advertisement campaign?
P(A|B) denotes the probability that event A occurs, conditional on B occurring. In terms of categorical random variables, we can also write P(X = x | Y = y), i.e. the probability of the RV X taking value x, conditional on the RV Y taking value y.
In our example of phone-plan costs, P($29|weekdays), or P(Price = $29 | Day = Weekdays), means the probability of a client choosing the $29 plan, conditional on her visiting the store during weekdays. Conditional probability can be computed using Bayes' rule:
P(A|B) = P(A ∩ B) / P(B)

So the probability of A conditional on B equals the joint probability of both A and B, divided (weighted) by the (marginal) probability of B. Similarly, the probability of B conditional on A equals the joint probability of both A and B, divided (weighted) by the (marginal) probability of A. Based on Bayes' rule, we also have
P(A ∩ B) = P(A|B) P(B) = P(B|A) P(A)
So the joint probability equals the conditional probability multiplied by the marginal probability. This leads to
P(A|B) = P(B|A) P(A) / P(B)
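A numeric sketch of Bayes' rule, with made-up probabilities:

```python
from math import isclose

# Hypothetical probabilities (illustration only)
p_b = 0.4           # marginal P(B)
p_a_and_b = 0.1     # joint P(A ∩ B)

p_a_given_b = p_a_and_b / p_b   # Bayes' rule: P(A|B) = P(A ∩ B) / P(B)
assert isclose(p_a_given_b, 0.25)

# Version 2: joint = conditional × marginal
assert isclose(p_a_given_b * p_b, p_a_and_b)
```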


The difference between joint probability and conditional probability: the former is the probability that A and B both occur at the same time; the latter is the probability that A occurs given that B has already occurred. Because B is known to have occurred in the event A|B, we weight (divide) the joint probability by P(B). If A and B are independent (events), whether or not B occurs does not affect the probability that A occurs; likewise, whether or not A occurs does not affect the probability that B occurs. This means:
P(A|B) = P(A) and P(B|A) = P(B)

Based on Bayes' rule version 2, this also means: P(A ∩ B) = P(A) × P(B)

The above relation is "if and only if": A and B are independent exactly when P(A ∩ B) = P(A) × P(B).

Example: are the events "client choosing $29 plan" and "client purchasing during weekdays" independent?

We have P($29) = 0.0625 and P(weekdays) = 0.475, so P($29) × P(weekdays) = 0.0297, which is different from P($29 ∩ weekdays) = 0.0125. So these two events are not independent.

4. Probability trees and binomial probability
Probability trees
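Using the probabilities quoted above, the independence check can be verified directly:

```python
p_29 = 0.0625        # P($29) from the lecture's example
p_weekdays = 0.475   # P(weekdays)
p_joint = 0.0125     # P($29 ∩ weekdays)

product = p_29 * p_weekdays   # what the joint would be under independence
assert round(product, 4) == 0.0297
assert product != p_joint     # joint differs, so the events are not independent
```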


Having learnt conditional probability, we can draw a probability tree to do scenario analysis. Let A and B denote the stock price (p) movement (+1, 0, −1) on day 1 and day 2 respectively.

A special case is when events come from only two outcomes (e.g. success or failure, i.e. binary outcomes) and are independent, so P(A|B) = P(A). In such a case, the probability tree is called a binomial tree. Suppose we have five products, each of which can be defective (D) with probability p or functional (F) with probability q = 1 − p.

Binomial distribution:


A RV X taking values in {0, 1, …, n} is said to follow the binomial distribution, denoted by X ~ Bin(n, p), if it describes the (random) number of successes out of n trials in a binomial experiment (meaning that successes in different trials are independent). The binomial distribution has the following probability distribution function (pdf), which calculates the probability of the RV equalling a certain number:
P(X = k) = C(n, k) p^k (1 − p)^(n − k), for k = 0, 1, …, n

Properties of binomial distributions:
- Almost all distributions have an expectation (i.e. mean) and a variance (so also a standard deviation). We learn some other important distributions in weeks 3 and 4.
- Every distribution (its pdf) is characterised by some parameters.
1. The binomial distribution has two parameters: n (the number of trials) and p (the success probability or success rate)
2. The mean (or expectation) and variance of X ~ Bin(n, p) are given by:
E(X) = np, Var(X) = np(1 − p)
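A sketch of the binomial pdf and its mean and variance, assuming the standard formula P(X = k) = C(n, k) p^k (1 − p)^(n−k). With n = 10 and p = 0.5 it reproduces the table lookup P(X = 4) ≈ 0.205:

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p**k * (1-p)**(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
assert round(binom_pmf(4, n, p), 4) == 0.2051   # matches the binomial table

mean = n * p             # E(X) = np
var = n * p * (1 - p)    # Var(X) = np(1-p)
sd = sqrt(var)           # standard deviation = square root of the variance
assert mean == 5.0 and var == 2.5
```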


LECTURE 3

1. Discrete probability distributions
Discrete probability distribution: the distribution of a discrete random variable.
Discrete random variable: a RV that takes discrete values. A discrete RV counts.
- Whole numbers, e.g. 1, 2, 3
Continuous random variable: a RV that takes values on (part of) the real line. A continuous RV measures.
- Does not have to be a whole number, e.g. 1, 1.5, 2, 2.5


Summation notation: Σᵢ xᵢ = x₁ + x₂ + … + xₙ

2. Probability Distribution Function
A discrete probability distribution can be defined via the probability distribution function (AKA PDF).
- This function assigns a probability between 0 and 1 to each possible outcome: 0 ≤ P(X = xᵢ) ≤ 1
- All these probabilities (the probability of each outcome occurring) sum up to 1: Σᵢ P(X = xᵢ) = 1
- The first condition states that the probability of each outcome xᵢ is between 0 and 1; the second states that the probabilities of all possible outcomes sum to 1.
  o Example: if I flip a coin, two outcomes may occur: heads with probability 0.5 and tails with probability 0.5. These probabilities add up to 1.

Based on the probability distribution function (PDF), we can compute the:
- Expectation
- Variance
- Standard deviation


Computing the expectation (mean):
μ = E(X) = Σᵢ xᵢ P(X = xᵢ)
- In English: the population mean is the sum, over all values, of xᵢ multiplied by the probability of xᵢ. For example, if x₂ = 2 has probability 0.6, you add (2 × 0.6) to the corresponding terms for the other values x₃, x₄, x₅, etc.

Computing the variance:
σ² = Var(X) = Σᵢ (xᵢ − μ)² P(X = xᵢ)
- The population variance is the sum, over all values, of the squared distance of xᵢ from the mean multiplied by the probability of that value: (difference from the mean)² × the value's probability.

Computing the standard deviation:
σ = √Var(X)
- Note: the standard deviation is nothing but the square root of the variance (hence the big square-root sign over the variance formula).

3. Discrete probability distribution: Introduction
A discrete probability distribution can be defined via the probability distribution function (pdf), which assigns a probability in [0, 1] to each possible outcome such that all probabilities sum up to 1.


xᵢ = the possible values: e.g. a patient can see a doctor between 0 and 5 times per month
P(X = xᵢ) = the probability of each value
- Σᵢ xᵢ P(X = xᵢ) = the expectation or mean: we get this by multiplying each value by its probability and adding the results together. In the example above, the mean (expectation) is 1.25 visits per month.
- Σᵢ (xᵢ − μ)² P(X = xᵢ) = THE VARIANCE: we multiply the squared distance of each value from the mean by its probability, and sum the results. The square root of this is the standard deviation.
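The expectation/variance recipe can be sketched in Python. The pdf table below is hypothetical: the lecture's actual table was a figure, so these probabilities are chosen only so that the mean matches the quoted 1.25 visits per month.

```python
from math import sqrt, isclose

# Hypothetical pdf for monthly doctor visits (NOT the lecture's table);
# the probabilities are chosen so the mean comes out to 1.25
pdf = {0: 0.30, 1: 0.35, 2: 0.20, 3: 0.10, 4: 0.05}
assert isclose(sum(pdf.values()), 1.0)                  # probabilities sum to 1

mu = sum(x * p for x, p in pdf.items())                 # E(X) = Σ x·P(X=x)
var = sum((x - mu) ** 2 * p for x, p in pdf.items())    # Var(X) = Σ (x-μ)²·P(X=x)
sd = sqrt(var)                                          # standard deviation

assert isclose(mu, 1.25)
assert isclose(var, 1.2875)
```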

Using the probability distribution function, we can define the Cumulative Distribution Function (CDF).

4. Cumulative Distribution Function
The cumulative distribution function calculates the probability that the random variable is smaller than or equal to a certain value. This is denoted:
F(x) = P(X ≤ x) = Σ over all xᵢ ≤ x of P(X = xᵢ)
The above states: the probability of the RV being smaller than or equal to a certain value equals the sum of the probabilities of all values that are less than or equal to that value.


The example above demonstrates that to find the probability of the RV being at or below a certain point, we add the probabilities of all values up to that point. The right table shows a running total of the probability of the RV being less than or equal to each point, e.g. the probability of the RV being less than or equal to 3 is 0.9.
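The running-total construction of the CDF can be sketched as follows, on a hypothetical pdf (not the lecture's table):

```python
from itertools import accumulate
from math import isclose

# Hypothetical discrete pdf (illustration only)
values = [0, 1, 2, 3, 4]
probs = [0.30, 0.35, 0.20, 0.10, 0.05]

# F(x) = P(X <= x): a running sum of the pdf probabilities
cdf = dict(zip(values, accumulate(probs)))

assert isclose(cdf[1], 0.65)   # P(X <= 1) = 0.30 + 0.35
assert isclose(cdf[4], 1.0)    # the CDF ends at 1
```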

5. Special distributions
Sometimes it is not practical to tabulate the PDF or CDF. So instead of tabulating (putting into a table) the probability distribution function and cumulative distribution function, we can mathematically model some special discrete distributions.
For this subject, we will learn the following special discrete distributions:
o Binomial distribution
o Bernoulli distribution
o Discrete uniform distribution
o Poisson distribution

Binomial Distribution
- The binomial distribution concerns the number of successes out of a number of trials
- It is denoted X ~ Bin(n, p), where X follows a binomial distribution with n trials and success probability p
- To calculate binomial probabilities, review the week 2 content
- To calculate the expectation (mean), we multiply n by p: E(X) = np
- To calculate the variance, we multiply n by p by (1 − p): Var(X) = np(1 − p)
- As always, the standard deviation is the square root of the variance. To calculate the coefficient of variation (CV), we divide the standard deviation by the mean (expectation), then multiply by 100% (WEEK 1 NOTES)
- To calculate the CDF, we sum the pdf values as above. Using Excel is highly recommended, or we can use an old-fashioned binomial table.
EXAMPLE
- If n = 10 and p = 0.5 and the question asks for the probability of having 4 successes, P(X = 4), you line up the table and see the probability is about 0.205.

Bernoulli Distribution
- A RV X taking value in {0, 1} is said to follow the Bernoulli distribution, denoted by X ~ Ber(p)
- The Bernoulli distribution describes a binary outcome: either 0, meaning failure, or 1, meaning success.


- The probabilities of the possible outcomes add up to one. The probability distribution function of the Bernoulli distribution is:
  P(X = 1) = p, P(X = 0) = 1 − p
- The only parameter of this function is p, the probability of success. The probability of success is always between 0 and 1.

- The Bernoulli distribution is a special case of the binomial distribution where the number of trials is only 1 (n = 1)
- It can also be denoted using the binomial distribution as X ~ Bin(1, p)
- Because of this, the expectation or mean is E(X) = p (the probability of success), and the variance is Var(X) = p(1 − p)

Example: Consider rolling a die only once. Let the RV X indicate whether a 6 appears. Then X ~ Ber(1/6). BASICALLY, A BERNOULLI DISTRIBUTION IS JUST A BINOMIAL DISTRIBUTION WITH ONLY ONE TRIAL, I.E. n = 1.
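Since Ber(p) = Bin(1, p), the Bernoulli probabilities fall out of the binomial pdf with n = 1. A sketch using the die example:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 1 / 6                            # die shows a 6: X ~ Ber(1/6) = Bin(1, 1/6)
assert binom_pmf(1, 1, p) == p       # P(X = 1) = p (success)
assert binom_pmf(0, 1, p) == 1 - p   # P(X = 0) = 1 - p (failure)

mean = p              # E(X) = p
var = p * (1 - p)     # Var(X) = p(1 - p)
```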

Discrete Uniform Distribution
- When a RV X takes values between two integers a and b, i.e. a, a + 1, …, b − 1, b, with all values equally likely, X is said to follow the discrete uniform distribution
- It is denoted by X ~ Unif(a, b)
- All potential outcomes between the two integers a and b are equally likely to occur. The probability distribution function (pdf) for a discrete uniform distribution is:
  P(X = x) = 1 / (b − a + 1), for x = a, a + 1, …, b
- The two parameters of this distribution are the two integers a and b between which the values lie.
- The expectation (mean) and variance of a discrete uniform distribution are:
  E(X) = (a + b) / 2, Var(X) = ((b − a + 1)² − 1) / 12
- Example: what is the outcome of rolling a fair die? The parameters are a = 1 and b = 6, so each outcome has probability 1/(6 − 1 + 1) = 1/6, and the expectation is (1 + 6)/2 = 3.5.
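A sketch of the discrete uniform formulas for the fair-die example, using exact fractions to avoid rounding:

```python
from fractions import Fraction

a, b = 1, 6            # a fair die: X ~ Unif(1, 6)
k = b - a + 1          # number of equally likely outcomes

pmf = Fraction(1, k)              # P(X = x) = 1/(b - a + 1)
mean = Fraction(a + b, 2)         # E(X) = (a + b)/2
var = Fraction(k**2 - 1, 12)      # Var(X) = ((b - a + 1)² - 1)/12

assert pmf == Fraction(1, 6)
assert mean == Fraction(7, 2)     # 3.5
assert var == Fraction(35, 12)
```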

Poisson Distribution
- This distribution is usually used for counts of arrivals within a given time frame.
- The random variable may take values 0, 1, 2, 3, … to infinity. Of course, because it is a discrete probability distribution, the same as all the others above, it takes only whole numbers.
- The Poisson distribution is denoted X ~ Poisson(λ), with pdf:
  P(X = k) = λ^k e^(−λ) / k!, for k = 0, 1, 2, …


This distribution has only one parameter: the Greek letter λ, called lambda. The lambda represents the intensity of arrivals (the average number of arrivals) within the given period of time. The expectation and variance are equal to each other, and both equal λ.
Because E(X) = λ, the intensity parameter λ can also be interpreted as the mean arrival rate. For example, we can use the Poisson distribution to model:

o The number of cars passing through a toll every minute. If λ = 5, it means that the intensity (mean arrival rate) of cars passing through the toll is 5 cars per minute.
o The number of customers coming into a KFC store every hour. If λ = 140, it means that the intensity (mean arrival rate) of customers coming to the store is 140 persons per hour.

Note: λ is not restricted to lie below 1; it can be any positive number.
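A sketch of the Poisson pdf, assuming the standard formula P(X = k) = λ^k e^(−λ)/k!, with the toll example's λ = 5:

```python
from math import exp, factorial, isclose

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): lam**k * e**(-lam) / k!"""
    return lam**k * exp(-lam) / factorial(k)

lam = 5   # 5 cars per minute through the toll

# The probabilities over all counts sum to 1 (the tail beyond 99 is negligible)
assert isclose(sum(poisson_pmf(k, lam) for k in range(100)), 1.0)

mean = lam   # E(X) = λ
var = lam    # Var(X) = λ
```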

EXAMPLE:

LECTURE 4

Continuous probability distributions: a continuous RV takes values on (part of) the real line. A continuous RV measures.
- Does not have to be a whole number, e.g. 1, 1.5, 2, 2.5


A continuous probability distribution can be defined via the probability density function (pdf), which assigns a positive value (density) to each possible outcome such that the densities integrate to 1.

- The difference between the two pdfs is that the "d" in the discrete pdf stands for distribution, while the "d" in the continuous pdf stands for density.
- Another difference: the discrete pdf's probabilities all add up to 1, while the continuous pdf's densities integrate to 1 over all possible outcomes.
- The continuous pdf does NOT restrict the density to lie between 0 and 1, unlike the discrete pdf (which is why it is called a density rather than a probability: you cannot have a probability bigger than 1). All these densities integrate to 1.
- The random variables in a continuous pdf do NOT need to be whole numbers, unlike a discrete RV. The distribution covers the real line (the real line means the real numbers put on a line) or part of it.
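The point that a density may exceed 1 while still integrating to 1 can be illustrated with a uniform density on a short interval (a made-up example, not from the lecture):

```python
# A uniform density on [0, 0.5] has height 2 everywhere on that interval:
# the density value exceeds 1, yet the total area (probability) is still 1.
a, b = 0.0, 0.5
density = 1 / (b - a)      # f(x) = 2 for x in [a, b]
area = density * (b - a)   # rectangle area = total probability

assert density == 2.0      # a density value can be greater than 1
assert area == 1.0         # but it still integrates to 1
```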

