Summation PDF

Title Summation
Author Daniel Filmer
Course Business Operations Management
Institution Western Sydney University
Pages 38

Summary

the course so far...


Description

TUTORIAL 1 Summation

Summation is the operation of adding a sequence of numbers; the result is their sum or total. For instance, the sum of the sequence {1, 5, -2, 3} is 1 + 5 + (-2) + 3 = 7. If the terms of a sequence follow a regular pattern, a summation operator is useful. Suppose that we want the total of the consecutive integers from 15 to 545. We may use an addition expression with an ellipsis to indicate the missing terms: 15 + 16 + ... + 544 + 545. In this case the pattern is easy to see. Alternatively we may use the summation operator "Σ", the capital-sigma notation. Using sigma notation, the above summation can be written as:

$\sum_{i=15}^{545} i$

Note that, when c is a constant, $\sum_{i=1}^{n} c = n \times c$.

Summation notation

This appendix offers an introduction to the use of summation notation. Because summation notation is used extensively throughout statistics, you should review this appendix even if you have had previous exposure to summation notation. Our coverage of the topic begins with an introduction to the necessary terminology and notation, follows with some examples, and concludes with four rules that are useful in applying summation notation.

Summation Notation

Often mathematical formulae require the addition of many variables. Summation or sigma notation is a convenient and simple form of shorthand used to give a concise expression for a sum of the values of a variable.

Let x1, x2, x3, …, xn denote a set of n numbers. x1 is the first number in the set. xi represents the ith number in the set.

Summation notation involves:

The summation sign
This appears as the symbol Σ, which is the Greek upper case letter sigma. The summation sign, Σ, instructs us to sum the elements of a sequence. A typical element of the sequence which is being summed appears to the right of the summation sign.

The variable of summation, i.e. the variable which is being summed
The variable of summation is represented by an index which is placed beneath the summation sign. The index is often represented by i. (Other common possibilities for representation of the index are j and t.) The index appears as the expression i = 1. The index assumes values starting with the value on the right hand side of this equation and ending with the value above the summation sign.

The starting point for the summation or the lower limit of the summation
This is the value on the right hand side of the index expression beneath the summation sign, for example 1 in i = 1.

The stopping point for the summation or the upper limit of summation
This is the value written above the summation sign, for example n.

Some typical examples of summation

$\sum_{i=1}^{n} x_i$
This expression means sum the values of x, starting at x1 and ending with xn.

$\sum_{i=1}^{10} x_i$
This expression means sum the values of x, starting at x1 and ending with x10.

$\sum_{i=3}^{10} x_i$
This expression means sum the values of x, starting at x3 and ending with x10.

$\sum x_i$
The limits of summation are often understood to mean i = 1 through n. Then the notation below and above the summation sign is omitted. Therefore this expression means sum the values of x, starting at x1 and ending with xn.

$\sum_{i=1}^{n} x_i^2$
This expression means sum the squared values of x, starting at x1 and ending with xn.

Arithmetic operations may be performed on variables within the summation. For example:

$\left(\sum_{i=1}^{n} x_i\right)^2$
This expression means sum the values of x, starting at x1 and ending with xn, and then square the sum.

Arithmetic operations may be performed on expressions containing more than one variable. For example:

$\sum_{i=1}^{n} x_i y_i$
This expression means form the product of x multiplied by y, starting at x1 and y1 and ending with xn and yn, and then sum the products.

$\sum_{i=1}^{n} c = nc$
In this expression c is a constant, i.e. an element which does not involve the variable of summation, and the sum involves n elements.
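The summation expressions described above can be tried out numerically. The following is a minimal Python sketch, not part of the original tutorial; the values in x and y are illustrative assumptions, and c = 11 is simply an example constant.

```python
# Illustrative data: x_1..x_4 and y_1..y_4 (assumed values, chosen for the example)
x = [1, 2, 3, 4]
y = [2, 0, 1, 5]
c = 11  # a constant that does not depend on the index i

sum_x = sum(x)                               # sum of x_i for i = 1..n
sum_of_squares = sum(v ** 2 for v in x)      # sum of x_i squared
square_of_sum = sum(x) ** 2                  # (sum of x_i) squared
sum_of_products = sum(a * b for a, b in zip(x, y))  # sum of x_i * y_i
sum_of_constant = sum(c for _ in x)          # sum of c over n terms = n * c

# Limits other than 1..n: x_3 + x_4 (Python lists are 0-indexed, so slice 2:4)
partial_sum = sum(x[2:4])

# The consecutive integers 15 + 16 + ... + 545 from the introduction
consecutive = sum(range(15, 546))

print(sum_x, sum_of_squares, square_of_sum)   # 10 30 100
print(sum_of_products, sum_of_constant)       # 25 44
print(partial_sum, consecutive)               # 7 148680
```

Note that the sum of squares (30) and the square of the sum (100) differ, which is why the position of the exponent relative to the summation sign matters.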

Problems

Data

i     xi
1     1
2     2
3     3
4     4

1. Find

2. Find

Data

i     xi
1     -1
2     3
3     7

and c, which is a constant, = 11

3. Find

FREQUENCY DISTRIBUTION TABLE

A frequency distribution table is a chart that summarizes values and their frequency. It's a useful way to organize data if you have a list of numbers that represent the frequency of a certain outcome in a sample. A frequency distribution table has two columns. The first column lists all the various outcomes that occur in the data, and the second column lists the frequency of each outcome. Putting this kind of data into a table helps make it simpler to understand and analyze.
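As a rough illustration of the two-column layout, the sketch below builds a frequency table in Python with collections.Counter. The list of outcomes is a made-up sample used only for this example.

```python
from collections import Counter

# Hypothetical sample of observed outcomes (assumed values for illustration)
outcomes = [2, 3, 2, 5, 3, 2, 4, 5, 2]

# Count how often each outcome occurs, then order the rows by outcome
freq_table = sorted(Counter(outcomes).items())

print("Outcome  Frequency")
for outcome, frequency in freq_table:
    print(f"{outcome:<8} {frequency}")

# The frequencies always add up to the number of observations
total = sum(frequency for _, frequency in freq_table)
```

The first column holds each distinct outcome and the second its count; the total of the second column equals the sample size, which is a quick check on the table.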

STEM AND LEAF PLOTS

Stem-and-leaf plots are a method for showing the frequency with which certain classes of values occur. You could make a frequency distribution table or a histogram for the values, or you can use a stem-and-leaf plot and let the numbers themselves show much the same information.
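A stem-and-leaf plot can be sketched in a few lines of Python. This is an illustrative sketch that assumes non-negative whole numbers and uses the tens digit as the stem and the units digit as the leaf; the function name stem_and_leaf and the ten data values are chosen only for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens part (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):          # sorting keeps stems and leaves in order
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [26, 23, 29, 31, 24, 22, 15, 31, 30, 20]
plot = stem_and_leaf(data)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
# 1 | 5
# 2 | 0 2 3 4 6 9
# 3 | 0 1 1
```

The length of each row shows the frequency of its class, so the printed plot reads like a histogram turned on its side while still showing every original value.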

FREQUENCY HISTOGRAM

A frequency histogram is a graph that displays measurements of a process output based on frequency of occurrence. It is used to visualize process outputs in order to measure the performance of a business or manufacturing process, and to identify opportunities for process improvement.

FREQUENCY POLYGONS

Frequency polygons are line graphs formed by joining the midpoints of the tops of the bars of a histogram. Frequency polygons illustrate the shape of the distribution of the data. The endpoints of a frequency polygon lie on the x-axis. To construct a frequency polygon, a bar graph or histogram must first be drawn. Then, a line graph is drawn over the bar graph. Frequency polygons and histograms are similar in that they show the same information but in a different manner. When comparing two different sets of data, it is often easier to display the data using a frequency polygon than a histogram. Some examples of data illustrated in a frequency polygon are the number of vehicles passing through a particular point in a route, the number of hours a student in a class spends studying, and the exam scores of two different sets of students.

1. Draw a histogram

Using the data in question, formulate the class intervals. Construct a table showing the class intervals and the corresponding values of the frequencies. The table makes it easy for you to draw the histogram. Use a pencil and a ruler to draw the Cartesian plane. Let the x-axis represent the class intervals, whereas the y-axis shows the frequencies of the class intervals. Plot the values of the frequencies against the class intervals. Draw the bars of the histogram to represent this data.

2. Calculate the values of the midpoints

Find the value of the midpoint of each class interval. The midpoint of each class is calculated by adding the two class boundaries and then dividing the sum by two.

3. Draw the frequency polygon

Using the midpoint values, mark the middle of each bar at the top with a pencil. Join the points with line segments. Ensure that the lines on the two ends of the graph touch the x-axis of the plane in order to demonstrate the continuity of the data values.

FREQUENCY OGIVE

The word ogive is a term used in architecture to describe curves or curved shapes. Ogives are graphs that are used to estimate how many values lie below or above a particular value in a data set. To construct an ogive, first calculate the cumulative frequencies using a frequency table. The cumulative frequency of a value or class is its own frequency plus the frequencies of all the preceding values or classes. The last number in the cumulative frequency column is always equal to the total of the frequencies.

An ogive (pronounced O-jive) is a graph that represents the cumulative frequencies for the classes in a frequency distribution, and it is a continuous frequency curve. An ogive has the shape of an elongated 'S' and is sometimes called a double curve, with one portion being concave and the other being convex. To construct one, it is necessary to first have the frequency table. To plot an ogive, we need class boundaries and the cumulative frequencies. For grouped data, the ogive is formed by plotting the cumulative frequency against the upper boundary of the class. For ungrouped data, cumulative frequency is plotted on the y-axis against the data, which is on the x-axis.

Ogive: The term ogive is attributed to the 13th-century itinerant master-builder Villard de Honnecourt. In statistics, an ogive is a free-hand graph used to find how many data values lie above or below a particular value. It is used to show the graph of a cumulative distribution function.
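The cumulative frequencies needed for an ogive can be computed with a running total. The sketch below is an illustration only: it reuses the class boundaries and frequencies from the deciles example in these notes, and pairs each class's upper boundary with its cumulative frequency to give the points that would be plotted.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency) for each class,
# taken from the deciles example in these notes
classes = [
    (3, 10, 2), (10, 17, 5), (17, 24, 3), (24, 31, 7), (31, 38, 12),
    (38, 45, 9), (45, 52, 2), (52, 59, 1), (59, 66, 1),
]

frequencies = [freq for _, _, freq in classes]
cumulative = list(accumulate(frequencies))   # running total F_i = F_(i-1) + f_i

# Ogive points for grouped data: (upper class boundary, cumulative frequency)
ogive_points = [(upper, cum) for (_, upper, _), cum in zip(classes, cumulative)]

for upper, cum in ogive_points:
    print(f"below {upper}: {cum} values")
```

The last cumulative frequency equals the total number of observations (42 here), which matches the check described in the text.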

Example:

Find the third and eighth deciles (30th and 80th percentiles) of the following data set: 26, 23, 29, 31, 24, 22, 15, 31, 30, 20.

Solution:

Writing the data in ascending order: 15, 20, 22, 23, 24, 26, 29, 30, 31, 31. (1 mark)
L30 position = 0.3 ∗ (n + 1) = 0.3 ∗ 11 = 3.3, i.e. between the 3rd and 4th terms (1 mark)
L30 = 22 + 0.3 ∗ (23 - 22) = 22.3 (1 mark)
L80 position = 0.8 ∗ (n + 1) = 0.8 ∗ 11 = 8.8, i.e. between the 8th and 9th terms (1 mark)
L80 = 30 + 0.8 ∗ (31 - 30) = 30.8 (1 mark)
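The position rule used in this solution can be expressed as a small Python function. This is a sketch of the p ∗ (n + 1) position method with linear interpolation as applied above; the function name is hypothetical, and clamping positions that fall outside the data to the smallest or largest value is an added assumption.

```python
import math

def percentile_n_plus_1(data, p):
    """Percentile at proportion p (0 < p < 1) using position p * (n + 1)."""
    ordered = sorted(data)
    n = len(ordered)
    position = p * (n + 1)            # 1-based position in the ordered data
    k = math.floor(position)          # whole part: the k-th term
    fraction = position - k           # fractional part used for interpolation
    if k < 1:                         # position before the first term (assumed clamp)
        return ordered[0]
    if k >= n:                        # position at or beyond the last term (assumed clamp)
        return ordered[-1]
    return ordered[k - 1] + fraction * (ordered[k] - ordered[k - 1])

data = [26, 23, 29, 31, 24, 22, 15, 31, 30, 20]
d3 = percentile_n_plus_1(data, 0.3)   # approximately 22.3
d8 = percentile_n_plus_1(data, 0.8)   # approximately 30.8
print(round(d3, 2), round(d8, 2))
```

Other textbooks and software packages use different position rules, so results from a library function may differ slightly from the values obtained here.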

Deciles: Definition, Formula, Example

Definition: Deciles divide the values of a distribution into ten groups of equal frequency.

Formula:

$D_k = L_i + \frac{\frac{k \cdot N}{10} - F_{i-1}}{f_i} \times a_i$

Where,
k = Number of the decile (1 to 9)
Li = Lower limit of the decile class
N = Sum of the absolute frequencies
Fi-1 = Cumulative frequency of the class below the decile class
fi = Absolute frequency of the decile class
ai = Width of the decile class

Example: Find the deciles for the following data.

3, 15, 24, 28, 33, 35, 38, 42, 43, 38, 36, 34, 29, 25, 17, 7, 34, 36, 39, 44, 31, 26, 20, 11, 13, 22, 27, 47, 39, 37, 34, 32, 35, 28, 38, 41, 48, 15, 32, 60, 56, 13.

Given: N = 42
To find: Deciles

Solution:

Step 1: Calculate the cumulative frequency for the given data. Cumulative frequency is calculated using the formula Fi = Fi-1 + fi.

Class      Frequency fi   Cumulative Frequency Fi
3 - 10     2              2
10 - 17    5              7
17 - 24    3              10
24 - 31    7              17
31 - 38    12             29
38 - 45    9              38
45 - 52    2              40
52 - 59    1              41
59 - 66    1              42
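The grouped-data decile formula can be sketched in Python using the class table above. This is an illustrative sketch, not the source's own procedure: the function name grouped_decile is hypothetical, and it assumes the decile class is the first class whose cumulative frequency reaches k·N/10.

```python
# (lower limit, upper limit, frequency) for each class in the table above
classes = [
    (3, 10, 2), (10, 17, 5), (17, 24, 3), (24, 31, 7), (31, 38, 12),
    (38, 45, 9), (45, 52, 2), (52, 59, 1), (59, 66, 1),
]

def grouped_decile(classes, k):
    """k-th decile (k = 1..9): D_k = L_i + (k*N/10 - F_(i-1)) / f_i * a_i."""
    if not 1 <= k <= 9:
        raise ValueError("k must be between 1 and 9")
    total = sum(freq for _, _, freq in classes)      # N
    target = k * total / 10                          # k*N/10
    cumulative_below = 0                             # F_(i-1)
    for lower, upper, freq in classes:
        # Assumption: the decile class is the first class whose
        # cumulative frequency is at least k*N/10
        if cumulative_below + freq >= target:
            width = upper - lower                    # a_i
            return lower + (target - cumulative_below) / freq * width
        cumulative_below += freq
    raise ValueError("no decile class found")

d1 = grouped_decile(classes, 1)   # 10 + (4.2 - 2) / 5 * 7, about 13.08
d2 = grouped_decile(classes, 2)   # 17 + (8.4 - 7) / 3 * 7, about 20.27
print(round(d1, 2), round(d2, 2))
```

A useful sanity check is that each decile must fall inside its own decile class; a result outside the class limits signals that the wrong class or the wrong cumulative frequency was used.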

In the above table, Class denotes the range of values, and Frequency (fi) denotes the number of values from the given data that fall within the class range. For example, class range 3 - 10 contains two values from the given data, i.e. 3 and 7.

Step 2: Now let us calculate the deciles.

Calculation of First Decile: First, consider k = 1.
k·N / 10 = 1 ∗ 42 / 10 = 4.2
The decile class is the first class whose cumulative frequency is at least 4.2. The cumulative frequency of class [3, 10) is only 2, while that of class [10, 17) is 7, so the first decile lies in the class range [10, 17). Now, substitute the values in the formula:
D1 = 10 + (4.2 - 2) / 5 ∗ 7
D1 = 10 + 0.44 ∗ 7
D1 = 13.08

Step 3: Calculation of Second Decile: Consider k = 2.
k·N / 10 = 2 ∗ 42 / 10 = 8.4
The cumulative frequency of class [10, 17) is 7, which is less than 8.4, while that of class [17, 24) is 10, so the second decile lies in the class range [17, 24). Substitute the values in the formula:
D2 = 17 + (8.4 - 7) / 3 ∗ 7
D2 = 17 + 0.4667 ∗ 7
D2 = 20.27 (to two decimal places)

Similarly, calculate the remaining seven deciles.

QUARTILES

Quartiles divide a set of numbers into four equal parts. To find the highest and lowest quartiles in a data set, first find the median of the entire set of numbers. Treat the sets of numbers above and below the median as separate sets, and then locate the medians of these groups. The median of the lowest set of numbers corresponds to the lowest, or first, quartile. The median of the entire set corresponds to the second quartile, whereas the median of the highest set corresponds to the highest, or third, quartile. Quartiles are calculated differently for odd and even numbers of values. If a data set has an odd number of values, the median is the middle value. If a data set has an even number of values, the median is the average of the middle two numbers. To find this average, add the two middle numbers together, and then divide the sum by two.

Quartile

A quartile is a type of quantile. The first quartile (Q1) is defined as the middle number between the smallest number and the median of the data set. The second quartile (Q2) is the median of the data. The third quartile (Q3) is the middle value between the median and the highest value of the data set. In applications of statistics such as epidemiology, sociology and finance, the quartiles of a ranked set of data values are the four subsets whose boundaries are the three quartile points. Thus an individual item might be described as being "in the upper quartile".

Definitions

Figure: Boxplot (with quartiles and an interquartile range) and a probability density function (pdf) of a normal N(0, σ²) population.

Symbol   Names                                              Definition
Q1       first quartile, lower quartile, 25th percentile    splits off the lowest 25% of data from the highest 75%
Q2       second quartile, median, 50th percentile           cuts data set in half
Q3       third quartile, upper quartile, 75th percentile    splits off the highest 25% of data from the lowest 75%

Computing methods

For discrete distributions, there is no universal agreement on selecting the quartile values.[1]

Method 1
1. Use the median to divide the ordered data set into two halves.
   o If there are an odd number of data points in the original ordered data set, do not include the median (the central value in the ordered list) in either half.
   o If there are an even number of data points in the original ordered data set, split this data set exactly in half.
2. The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

Method 2
1. Use the median to divide the ordered data set into two halves.
   o If there are an odd number of data points in the original ordered data set, include the median (the central value in the ordered list) in both halves.
   o If there are an even number of data points in the original ordered data set, split this data set exactly in half.
2. The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

Method 3
1. If there are an even number of data points, then Method 3 is the same as either method above, since the median is no single datum.
2. If there are (4n+1) data points, then the lower quartile is 25% of the nth data value plus 75% of the (n+1)th data value; the upper quartile is 75% of the (3n+1)th data point plus 25% of the (3n+2)th data point.
3. If there are (4n+3) data points, then the lower quartile is 75% of the (n+1)th data value plus 25% of the (n+2)th data value; the upper quartile is 25% of the (3n+2)th data point plus 75% of the (3n+3)th data point.

This always gives the arithmetic mean of Methods 1 and 2; it ensures that the median value is given its correct weight, and thus quartile values change as smoothly as possible as additional data points are added.

Example 1

Ordered Data Set: 6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49

      Method 1   Method 2   Method 3
Q1    15         25.5       20.25
Q2    40         40         40
Q3    43         42.5       42.75

Example 2

Ordered Data Set: 7, 15, 36, 39, 40, 41

As there are an even number of data points, all three methods give the same results.

      Method 1   Method 2   Method 3
Q1    15         15         15
Q2    37.5       37.5       37.5
Q3    40         40         40
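The three computing methods can be compared with a short Python sketch. The function names are hypothetical, and Method 3 is implemented here through the property stated above (it equals the arithmetic mean of Methods 1 and 2) rather than through the explicit 25%/75% weights.

```python
from statistics import median

def split_halves(data, include_median):
    """Split ordered data into lower and upper halves around the median."""
    ordered = sorted(data)
    half = len(ordered) // 2
    if len(ordered) % 2 == 0:            # even count: split exactly in half
        return ordered[:half], ordered[half:]
    if include_median:                   # odd count, median placed in both halves
        return ordered[:half + 1], ordered[half:]
    return ordered[:half], ordered[half + 1:]   # odd count, median excluded

def quartiles_method1(data):
    lower, upper = split_halves(data, include_median=False)
    return median(lower), median(data), median(upper)

def quartiles_method2(data):
    lower, upper = split_halves(data, include_median=True)
    return median(lower), median(data), median(upper)

def quartiles_method3(data):
    # Uses the stated property: Method 3 is the mean of Methods 1 and 2
    q1_a, q2, q3_a = quartiles_method1(data)
    q1_b, _, q3_b = quartiles_method2(data)
    return (q1_a + q1_b) / 2, q2, (q3_a + q3_b) / 2

example1 = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
example2 = [7, 15, 36, 39, 40, 41]

print(quartiles_method1(example1))   # (15, 40, 43)
print(quartiles_method2(example1))   # (25.5, 40, 42.5)
print(quartiles_method3(example1))   # (20.25, 40, 42.75)
```

For the six-point data set of Example 2 all three functions return the same quartiles, 15, 37.5 and 40, in line with the table.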

Statistics offers methods for checking for outliers. In keeping with the basic idea of descriptive statistics, when we encounter an outlier we have to explain the value by further analysis of its cause or origin. In cases of extreme observations, which are not an infrequent occurrence, the typical values must be analyzed. In the case of quartiles, the Interquartile Range (IQR) may be used to characterize the data when there may be extremities that skew the data; the interquartile range is a relatively robust statistic (also sometimes called "resistant") compared to the range and standard deviation. There is also a mathematical method to check for outliers by determining "fences", upper and lower limits from which to check for outliers.

After determining the first and third quartiles and the interquartile range as outlined above, the fences are calculated using the following formulas:

$\text{Lower fence} = Q_1 - 1.5 \times \mathrm{IQR}$
$\text{Upper fence} = Q_3 + 1.5 \times \mathrm{IQR}$

where Q1 and Q3 are the first and third quartiles, respectively, and IQR = Q3 − Q1. The Lower fence is the "lower limit" and the Upper fence is the "upper limit" of the data, and any data lying below the Lower fence or above the Upper fence can be considered an outlier. The fences provide a guideline by which to define an outlier, which may be defined in other ways. The fences define a "range" outside of which an outlier exists; a way to picture this is the boundary of a fence, outside of which are "outsiders" as opposed to outliers.
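The fence check can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the multiplier 1.5 is the conventional choice and is assumed here, and the quartiles Q1 = 15 and Q3 = 43 are the Method 1 values for the Example 1 data set. The extra value 120 is added only to show an outlier being flagged.

```python
def fences(q1, q3, multiplier=1.5):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1                         # interquartile range
    return q1 - multiplier * iqr, q3 + multiplier * iqr

def find_outliers(data, q1, q3, multiplier=1.5):
    """Values lying below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3, multiplier)
    return [value for value in data if value < lower or value > upper]

data = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
lower_fence, upper_fence = fences(15, 43)            # IQR = 28
print(lower_fence, upper_fence)                      # -27.0 85.0

print(find_outliers(data, 15, 43))                   # []
print(find_outliers(data + [120], 15, 43))           # [120]
```

Because the fences depend on which quartile method is used, the same data set can yield slightly different fences under Method 2 or Method 3.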

Interquartile range

In descriptive statistics, the interquartile range (IQR), also called the midspread or middle 50%, or technically H-spread, is a measure of statistical dispersion, being equal to the difference between the 75th and 25th percentiles, or between the upper and lower quartiles,[1][2] IQR = Q3 − Q1. In other words, the IQR is the first quartile subtracted from the third quartile; these quartiles can be clearly seen on a box plot of the data. It is a trimmed estimator, defined as the 25% trimmed range, and is the most significant basic robust measure of scale.

The IQR is a measure of variability, based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts. The values that separate the parts are called the first, second, and third quartiles, and they are denoted by Q1, Q2, and Q3, respectively.

Figure: Boxplot (with an interquartile range) and a probability density function (pdf) of a Normal N(0, σ²) population.

Use

Unlike total range, the interquartile range has a breakdown point of 25%,[3] and is thus often preferred to the total range.

The IQR is used to build box plots, simple graphical representations of a probability distribution.

For a symmetric distribution (where the median equals the midhinge, the average of the first and third quartiles), half the IQR equals the median absolute deviation (MAD).

The median is the corresponding measure of central tendency.

The IQR can be used to identify outliers (see below).

The quartile deviation or semi-interquartile range is defined as half the IQR.[4][5]

Algorithm

Quartiles are calculated recursively, by using the median.[6]

If the number of entries is an even number 2n, then the first quartile Q1 = median of the n smallest entries and the third quartile Q3 = median of the n largest entries.[6]

If the number of entries is an odd number 2n+1, then the first quartile Q1 = median of the n smallest entries and the third quartile Q3 = median of the n largest entries.[6]

The second quartile Q2 is the same as the ordinary median.[6]

Examples

Data set in a table

The following table has 13 rows, and follows the rules for the odd number of entries.

i     x[i]
1     7
2     7
3     31
4     31
5     47

6

...

