ECON1193-Group 12-Team 02-Fri 8AM-1 PDF

Title ECON1193-Group 12-Team 02-Fri 8AM-1
Course Business Statistics
Institution Royal Melbourne Institute of Technology University Vietnam


Assessment 3A Team Report

Team Members
• Phan Tan Phuc Cuong - s3864161
• Le Cong Minh - s3834313
• Nguyen Ba Uyen Cat - s3757801
• Nguyen Tuan Anh - s3777325
• Nguyen Ngoc Anh Thu - s3741263
• Chau Ngan Hong - s3813338

Lecturer: Doan Bao Huy
Group 12 - Team 02

First name   Student ID   Parts contributed   Contribution %   Signature
Cuong        s3864161     1, 3, 4             100%
Minh         s3834313     5, 6, 7             100%
Cat          s3757801     3, 4                100%
Anh          s3777325     2, 7                100%
Hong         s3813338     5, 6                100%
Thu          s3741263     2, 7                100%

Table of Contents

PART 1: DATA COLLECTION
PART 2: DESCRIPTIVE STATISTIC
    2.1: MEASURE OF CENTRAL TENDENCY
    2.2: MEASURE OF VARIATION
    2.3: MEASURE OF SHAPE
PART 3: MULTIPLE REGRESSION (FROM YOUR COLLECTED DATA)
    3.1: REGRESSION MODEL FOR THE DATA SET LI: THE LOW-INCOME COUNTRIES
    3.2: REGRESSION MODEL FOR THE DATA SET LMI: THE LOWER-MIDDLE INCOME COUNTRIES
    3.3: REGRESSION MODEL FOR THE DATA SET UMI: THE UPPER-MIDDLE INCOME COUNTRIES
    3.4: REGRESSION MODEL FOR THE DATA SET HI: THE HIGH-INCOME COUNTRIES
PART 4: TEAM REGRESSION CONCLUSION
PART 5: TIME SERIES
    5.1: TIME SERIES TREND MODEL
        A. LINEAR TREND
        B. QUADRATIC TREND
        C. EXPONENTIAL TREND
    5.2: RECOMMENDED TREND MODEL
    5.3: PREDICTION
PART 6: TIME SERIES CONCLUSION
    6.1: LINE CHART
    6.2: INTERPRETATION
PART 7: OVERALL CONCLUSION
REFERENCES
APPENDIX


PART 1: DATA COLLECTION

All data were collected as secondary data from the World Bank database for the year 2017, and all variables are quantitative. According to the World Bank, one current international dollar has the same purchasing power as one US dollar has in the United States. The data collection process began when we downloaded the Excel file of the world's Gross National Income (GNI) per capita from the World Bank database. In a separate Excel file, we randomly selected 30 countries for each income category (low, lower-middle, upper-middle, and high income) as classified in the GNI file, which also satisfies the assignment's requirement for each income category. After that, we downloaded the Excel files for Domestic General Government Health Expenditure Per Capita, PPP (current international $); Immunization, Measles (% of children ages 12-23 months); Compulsory Education Duration (Years); and Child Mortality Rate Under 5 (per 1,000 live births) from the World Bank. We then transferred the data for those indicators into the Excel file of 120 countries using the VLOOKUP function. Finally, to obtain a complete data set, we eliminated every country with missing data for any of those indicators.

PART 2: DESCRIPTIVE STATISTIC

Outliers should be identified before computing descriptive statistics, because they can distort the results and hide important findings. As Figure 1 shows, there are 5 upper outliers in the data set: 1 lower-middle-income country, 1 upper-middle-income country and 3 high-income countries, each with an unusually high child mortality rate under the age of 5. For example, Panama, a high-income country, has a rate of 15.9 deaths per 1,000 live births, far above the rates of the other countries in its group.

Category                               Min    vs   Q1 - 1.5×IQR   Max     vs   Q3 + 1.5×IQR   Result
Low-Income countries (LI)              33.2   >    -8.875         103.2   <    161.325        0 outliers
Lower-Middle Income countries (LMI)    8.9    >    -32.58         122.8   >    116.7          1 upper outlier
Upper-Middle Income countries (UMI)    3.6    >    -0.2           31.2    >    23.8           1 upper outlier
High-Income countries (HI)             2.4    >    -0.725         15.9    >    9.9            3 upper outliers

Figure 1: Test of outliers in each country category in 2017 (unit: deaths per 1,000 live births)
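The outlier test in Figure 1 can be reproduced from summary statistics alone. The following is a minimal sketch, not the team's actual Excel workflow. It assumes that the values 97.5, 60.73, 14.8 and 5.9 quoted in Section 2.3 are the third quartiles, and it derives the first quartile as Q3 - IQR and the minimum as maximum - range, using the IQR and range values from Figure 3. The names `SUMMARY` and `outlier_check` are hypothetical.

```python
# Summary values as reported: Q3 (Section 2.3, assumed to be third quartiles),
# IQR and range (Figure 3), maximum (Figure 1).
# Q1 and the minimum are derived here, not reported directly (assumption).
SUMMARY = {
    "LI":  {"q3": 97.5,  "iqr": 42.55, "max": 103.2, "range": 70.0},
    "LMI": {"q3": 60.73, "iqr": 37.33, "max": 122.8, "range": 113.9},
    "UMI": {"q3": 14.8,  "iqr": 6.0,   "max": 31.2,  "range": 27.6},
    "HI":  {"q3": 5.9,   "iqr": 2.65,  "max": 15.9,  "range": 13.5},
}


def outlier_check(s, k=1.5):
    """Compute the 1.5*IQR fences and flag whether min/max fall outside them."""
    q1 = s["q3"] - s["iqr"]            # derived first quartile
    minimum = s["max"] - s["range"]    # derived minimum
    lower = q1 - k * s["iqr"]          # lower fence: Q1 - 1.5*IQR
    upper = s["q3"] + k * s["iqr"]     # upper fence: Q3 + 1.5*IQR
    return {
        "lower_fence": lower,
        "upper_fence": upper,
        "has_lower_outlier": minimum < lower,
        "has_upper_outlier": s["max"] > upper,
    }


results = {group: outlier_check(s) for group, s in SUMMARY.items()}
for group, r in results.items():
    print(group, round(r["lower_fence"], 3), round(r["upper_fence"], 3),
          r["has_lower_outlier"], r["has_upper_outlier"])
```

Because only summary statistics are used, the sketch can tell whether at least one outlier exists on each side, but it cannot count them; counting requires the country-level values.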

2.1: MEASURE OF CENTRAL TENDENCY

Measure   LI        LMI       UMI     HI
Mean      72.3      44        12.9    5.1
Median    64.9      31.3      13.7    4
Mode      No mode   No mode   14.7    2.7

Figure 2: Central tendency of each country category's child mortality rate under the age of 5 in 2017 (unit: deaths per 1,000 live births)

Because the data set contains extreme values, the mean is pulled toward them and no longer represents a typical country. The mode is not an ideal measure either: the data are numerical and no value repeats in the low-income and lower-middle-income groups, so no mode exists for those groups. The median is therefore the most suitable measure of central tendency here. From Figure 2, the median of the low-income countries is the highest at 64.9, meaning that half of the countries in this group have fewer than 64.9 child deaths per 1,000 live births. Following this descending pattern, lower-middle-income countries rank second with a median of 31.3 and upper-middle-income countries rank third with 13.7. Finally, half of the high-income countries have fewer than 4 child deaths per 1,000 live births. This suggests a possible association between low income and high child mortality.

2.2: MEASURE OF VARIATION

Measure                    LI      LMI     UMI    HI
Interquartile Range        42.55   37.33   6      2.65
Standard Deviation         27.2    30.89   6.2    3.16
Range                      70      113.9   27.6   13.5
Coefficient of Variation   38%     70%     48%    61%

Figure 3: Measure of variation of each country category's child mortality rate under the age of 5 in 2017 (unit: deaths per 1,000 live births).

Because the data set contains five upper outliers in total, the Interquartile Range (IQR) is the most suitable measure of variation. The IQR is the distance between the first and third quartiles, that is, the spread of the middle 50% of the data, so it is not affected by outliers. Based on Figure 3, the IQR of the low-income group is the highest at 42.55, followed by the lower-middle-income group at 37.33. The upper-middle-income and high-income groups have much lower IQRs of 6 and 2.65 respectively, with the high-income group the lowest. In other words, child mortality rates under the age of 5 vary far more widely among low-income and lower-middle-income countries than among countries in the other two categories.
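The coefficient of variation row in Figure 3 can be cross-checked against the means in Figure 2, since it is the standard deviation expressed as a percentage of the mean. The sketch below is illustrative only; the function name `coefficient_of_variation` is hypothetical, and the inputs are the rounded figures printed in the report, so results may differ from the reported percentages by about one point.

```python
# Rounded means (Figure 2) and standard deviations (Figure 3) from the report.
MEAN = {"LI": 72.3, "LMI": 44.0, "UMI": 12.9, "HI": 5.1}
SD = {"LI": 27.2, "LMI": 30.89, "UMI": 6.2, "HI": 3.16}


def coefficient_of_variation(sd, mean):
    """Return the standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv = {group: coefficient_of_variation(SD[group], MEAN[group]) for group in MEAN}
for group, value in cv.items():
    print(f"{group}: {value:.1f}%")
```

Unlike the IQR, the coefficient of variation is built from the mean and standard deviation, so it remains sensitive to the outliers identified in Figure 1.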


2.3: MEASURE OF SHAPE

Figure 4: Measure of shape of low-income countries group on the child mortality rate under the age of 5 (unit: death per 1,000 live births).

Figure 5: Measure of shape of lower-middle income countries group on the child mortality rate under the age of 5 (unit: death per 1,000 live births).

The maximum value of the low-income group (103.2) is lower than that of the lower-middle-income group (122.8). Even so, Figures 4 and 5 show a clear separation between the two groups: the third quartile of the child mortality rate under the age of 5 is 97.5 deaths per 1,000 live births in low-income countries, but only 60.73 in the lower-middle-income category.

Figure 6: Measure of shape of upper-middle income countries group on the child mortality rate under the age of 5 (unit: death per 1,000 live births).


Figure 7: Measure of shape of high-income countries group on the child mortality rate under the age of 5 in 2017 (unit: death per 1,000 live births).

The box and whisker plots above (Figures 6 and 7) show that the maximum value of the upper-middle-income category (31.2) is about twice that of the high-income group (15.9). As with Figures 4 and 5, Figures 6 and 7 also show a distinct separation: the third quartile of the child mortality rate under the age of 5 is 14.8 deaths per 1,000 live births in the upper-middle-income group, compared with 5.9 in the high-income group. Taken together, the box and whisker plots show that child mortality rates in the low-income and lower-middle-income categories are much higher than in the other two groups, which indicates how closely income is associated with mortality among children under 5 years of age.

Category   Factor                     Left    Comparison   Right   Result
LI         Box                        9.95    <            32.6    Right-skewed
LI         Whisker                    21.75   >            5.7     Left-skewed
LI         Median to extreme values   31.7    <            38.3    Right-skewed
LI         Conclusion                                              Right-skewed
LMI        Box                        7.9     <            29.43   Right-skewed
LMI        Whisker                    14.5    <            62.08   Right-skewed
LMI        Median to extreme values   22.4    <            91.51   Right-skewed
LMI        Conclusion                                              Right-skewed
UMI        Box                        4.9     >            1.1     Left-skewed
UMI        Whisker                    5.2     <            16.4    Right-skewed
UMI        Median to extreme values   10.1    <            17.5    Right-skewed
UMI        Conclusion                                              Right-skewed
HI         Box                        0.75    <            1.9     Right-skewed
HI         Whisker                    0.85    <            10      Right-skewed
HI         Median to extreme values   1.6     <            11.9    Right-skewed
HI         Conclusion                                              Right-skewed

Figure 8: Country categories' box, whisker, and median comparison in 2017 (unit: deaths per 1,000 live births).

Comparing the box, whisker, and median-to-extreme distances shows that all four country categories are right-skewed. The median of the low-income group is the highest at 64.9 deaths per 1,000 live births, followed by the lower-middle-income group with a median of 31.3. The upper-middle-income and high-income categories are lower, with medians of 13.7 and 4 respectively.

PART 3: MULTIPLE REGRESSION (FROM YOUR COLLECTED DATA)

To build a multiple regression model for a data set, we must first define all variables, both dependent and independent. In this case, the regression model aims to explain the mortality rate of children under 5 years old for the countries in each income category. Y denotes the dependent variable and X1, X2, X3, X4 denote the independent variables, defined below.

• Y: The child mortality rate under 5 (per 1,000 live births) in 2017
• X1: Domestic general government health expenditure per capita, PPP (current international $) in 2017
• X2: Immunization, measles (% of children ages 12-23 months) in 2017
• X3: Compulsory Education Duration (Years) in 2017
• X4: GNI per Capita, Atlas method (US$) in 2017

3.1: REGRESSION MODEL FOR THE DATA SET LI: THE LOW-INCOME COUNTRIES

a.1. Final regression output: Backward elimination is used to choose the best regression model. Independent variables that are not significant at the 0.05 level are removed, and only variables related to the dependent variable are kept, as shown in Appendix 1. Although a final regression output is obtained, its p-value is still higher than the significance level of 0.05. Given this, and the fact that none of the preceding regression outputs contains a significant variable either, the regression model for the low-income countries is not statistically significant.

Figure 9: Final regression output for the data set of the low-income countries in 2017

Figure 9 above is the final regression output, with immunization, measles (% of children ages 12-23 months) as the only remaining independent variable.

a.2. Scatter plot:


Figure 10: Scatter plot of low-income countries' child mortality rate under the age of 5 (per 1,000 live births) vs immunization, measles (% of children ages 12-23 months) in 2017

b. Regression equation: The final regression model estimates the child mortality rate under the age of 5 with the following equation:

$\hat{Y} = 104.2159 - 0.4672X$

where $\hat{Y}$ is the predicted child mortality rate under the age of 5 and $X$ is the percentage of children ages 12-23 months who have been immunized against measles.

c. Interpret the regression coefficient of the significant independent variable/s in the context of your research:

$\beta_1 = -0.4672$

The slope of -0.4672 indicates that the child mortality rate under the age of 5 decreases by 0.4672 deaths per 1,000 live births for each one-percentage-point increase in the share of children ages 12-23 months immunized against measles.

d. Interpret the coefficient of determination in the context of your research:

$R^2 = 0.0733$

The coefficient of determination of 0.0733, or 7.33%, shows that 7.33% of the variation in the child mortality rate under the age of 5 is explained by variation in the percentage of children ages 12-23 months immunized against measles. For low-income countries, the link between the child mortality rate and measles immunization is therefore very weak.

100% - 7.33% = 92.67%

The remaining 92.67% of the variation in the mortality rate of children under 5 years old could be explained by other factors, such as infections, disorders, or other factors not included in our study. However, because the independent variable is insignificant (its p-value is higher than the significance level of 0.05), the model is considered invalid and is rejected.
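To make the slope and the coefficient of determination concrete, the fitted low-income equation can be evaluated directly. This is a minimal sketch using the intercept, slope and R² quoted above; the 80% and 81% coverage inputs are hypothetical values chosen only for illustration, and `predict_mortality` is a hypothetical helper name. Since the report finds this model not significant, the numbers illustrate the arithmetic rather than a reliable forecast.

```python
# Coefficients and R-squared as quoted in Section 3.1.
INTERCEPT = 104.2159
SLOPE = -0.4672        # change in deaths per 1,000 per percentage point of coverage
R_SQUARED = 0.0733


def predict_mortality(measles_coverage_pct):
    """Predicted under-5 mortality (per 1,000 live births) for a coverage level."""
    return INTERCEPT + SLOPE * measles_coverage_pct


# Hypothetical coverage levels, for illustration only.
at_80 = predict_mortality(80.0)
at_81 = predict_mortality(81.0)
change_per_point = at_81 - at_80     # equals the slope
unexplained_share = 1 - R_SQUARED    # share of variation the model leaves unexplained

print(f"Predicted rate at 80% coverage: {at_80:.4f}")
print(f"Change for one extra percentage point: {change_per_point:.4f}")
print(f"Unexplained variation: {unexplained_share:.2%}")
```

The unexplained share is the complement of R², which is why it is 92.67% for this model.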

3.2: REGRESSION MODEL FOR THE DATA SET LMI: THE LOWER-MIDDLE INCOME COUNTRIES

a.1. Final regression output: The final regression output for the lower-middle-income countries is obtained by applying the backward elimination process shown in Appendix 2.

Figure 11: Final regression output for the data set of the lower-middle-income countries


a.2. Scatter plot:

Figure 12: Scatter plot of lower-middle-income countries' child mortality rate under the age of 5 (per 1,000 live births) vs immunization, measles and GNI per capita, Atlas method

b. Regression equation: The final regression model estimates the child mortality rate under the age of 5 with the following regression equation:

$\hat{Y} = 200.866 - 1.4195X_2 - 0.0157X_4$

In this case, $\hat{Y}$ is the predicted child mortality rate under the age of 5, $X_2$ is immunization, measles (percent of children ages 12-23 months) and $X_4$ is GNI per capita, Atlas method (US$).

c. Interpret the regression coefficient of the significant independent variable/s in the context of your research:

$\beta_2 = -1.4195$

The slope of -1.4195 shows that the child mortality rate under the age of 5 decreases by 1.4195 deaths per 1,000 live births for each one-percentage-point increase in the share of children ages 12-23 months immunized against measles, holding GNI per capita constant.

$\beta_4 = -0.0157$

Likewise, the slope of -0.0157 indicates that the child mortality rate under the age of 5 decreases by 0.0157 deaths per 1,000 live births for each US$1 increase in Gross National Income per capita, holding measles immunization constant.

d. Interpret the coefficient of determination in the context of your research:

$R^2 = 0.7589$

The coefficient of determination of 0.7589, or 75.89%, means that 75.89% of the variation in the child mortality rate under the age of 5 is explained by variation in Gross National Income per capita (US$) and in the percentage of children ages 12-23 months immunized against measles.

100% - 75.89% = 24.11%

The remaining 24.11% of the variation in the mortality rate of children under 5 years old could be explained by other factors, such as infections, disorders, or other factors not included in our study.
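With two predictors, each slope is read as the effect of one variable while the other is held fixed. The sketch below illustrates this, assuming the equation has the form intercept + slope × X2 + slope × X4 with the intercept 200.866 and the two slopes interpreted in part c. The input values (90% coverage, US$2,000 and US$2,100 GNI per capita) are hypothetical, as is the helper name `predict_lmi`.

```python
# Coefficients quoted in Section 3.2; the additive form is an assumption.
INTERCEPT = 200.866
SLOPE_X2 = -1.4195   # immunization, measles (% of children ages 12-23 months)
SLOPE_X4 = -0.0157   # GNI per capita, Atlas method (US$)


def predict_lmi(measles_pct, gni_usd):
    """Predicted under-5 mortality (per 1,000 live births) for an LMI country."""
    return INTERCEPT + SLOPE_X2 * measles_pct + SLOPE_X4 * gni_usd


# Hypothetical inputs, for illustration only.
baseline = predict_lmi(90.0, 2000.0)
richer = predict_lmi(90.0, 2100.0)       # GNI up by US$100, coverage unchanged
effect_of_100_usd = richer - baseline    # 100 times the X4 slope

print(f"Baseline prediction: {baseline:.3f}")
print(f"Effect of +US$100 GNI per capita: {effect_of_100_usd:.2f}")
```

Changing only one argument between the two calls is what "holding the other variable constant" means in practice.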

3.3: REGRESSION MODEL FOR THE DATA SET UMI: THE UPPER-MIDDLE INCOME COUNTRIES

a.1. Final regression output: The final regression output for the upper-middle-income countries is obtained by applying the backward elimination process shown in Appendix 3.

Figure 13: Final regression output for the data set of the upper-middle-income countries

a.2. Scatter plot:

Figure...

