GGR276 Lecture 5 - Yuhong He PDF

Title GGR276 Lecture 5 - Yuhong He
Author Howard Chen
Course Spatial Statistics
Institution University of Toronto
Pages 57

GGR276 Spatial Data Analysis and Mapping

Lecture 5 | Data Representation & Inferential Statistics
Prof. Yuhong He

Midterm Study Guide
• Is uploaded to Blackboard
• Midterm Date: Tuesday February 7, 2017, 1:00-3:00pm in DV2082
• This is a 2 hour exam, worth 20% of your final course grade.
• You are allowed to use non-programmable calculators and rulers.
• You are not allowed to share calculators, rulers, or anything else with other students writing the exam.

Lecture 5 – Midterm Review

Sign-in SOCRATIVE
Input Room Code: HEGGR276
INPUT: LAST name, FIRST name (as it appears on ROSI/ACORN/BB)

Overview
• Data Representation
• Probability distributions

Readings for this class:
• Chapter 5 (5.1, 5.2, 5.4)
• Chapter 6 (all)

Data for 48 students in 2016 summer GGR276

Data Representation | GRAPHING
• Need a way to organize data – raw data in a spreadsheet is not useful
• To draw conclusions from data, they must be organized in a meaningful way
  • Models
  • Graphs
  • Charts
• Easiest, most convenient method: frequency distribution

HISTOGRAM
• Divide measurement into equal-sized categories (i.e. Bin Width).
• Draw a bar for each category so bars’ heights represent the number or percentage in each category.
• Not good for small datasets.

[Figure: histogram, "2016 Summer GGR276 | Height". X-axis: Height (cm), bin width 10 cm (140-149, 150-159, 160-169, 170-179, etc.). Y-axis: Frequency.]

R CODE: > hist(classdata$happy, )
EXCEL: Data > Data Analysis > Histogram
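As a minimal sketch of the binning idea, the R code below builds a histogram with an explicit bin width of 10 cm. The `heights` vector is made-up illustration data, not the class dataset, and the bin edges are an assumption chosen to resemble the 140-149, 150-159, ... categories.

```r
# Hypothetical heights in cm (NOT the class data from the slide)
heights <- c(148, 152, 155, 158, 161, 163, 164, 166, 167, 169,
             171, 172, 174, 175, 178, 181, 183, 186, 192, 198)

# Equal-sized bins of width 10 cm: 140-149, 150-159, ...
bin_edges <- seq(140, 200, by = 10)

# right = FALSE: each bin includes its lower edge and excludes its upper edge
h <- hist(heights, breaks = bin_edges, right = FALSE,
          main = "Height (hypothetical data)",
          xlab = "Height (cm)", ylab = "Frequency")

# Bar heights: number of observations in each bin
h$counts
```

Changing `by = 10` to another value shows how the bin width alters the shape of the same data.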

Presenting Qualitative Data

Use to describe:
• What categories have been measured
• How often each category has occurred

“How often” can be measured in 3 ways:
1. Frequency (straight counts)
2. Relative frequency (freq / # obs)
3. Percent = 100 * Relative frequency
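The three measures of "how often" can be sketched in R as follows. The `sport` responses are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical responses for one qualitative variable (illustration only)
sport <- c("Soccer", "Hockey", "Soccer", "None", "Basketball",
           "Soccer", "Basketball", "None", "Soccer", "Basketball")

freq     <- table(sport)           # 1. Frequency (straight counts)
rel_freq <- freq / length(sport)   # 2. Relative frequency (freq / # obs)
percent  <- 100 * rel_freq         # 3. Percent = 100 * relative frequency

# Side-by-side summary, one row per category
data.frame(Frequency         = as.vector(freq),
           RelativeFrequency = as.vector(rel_freq),
           Percent           = as.vector(percent),
           row.names         = names(freq))
```

Relative frequencies always sum to 1 and percents to 100, which is a quick check on any table built this way.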

PIE CHART
R CODE: > pie(mm.vector)
EXCEL: Insert > Charts > Select Pie Chart from drop-down menu

BAR CHART
R CODE: barplot()
EXCEL: Insert > Charts > Bar or Column

GGR276 | Sport Preference

Sport        Frequency
Basketball   19
Hockey       2
Baseball     4
Soccer       9
Football     4
None         9

[Figure: bar chart, "GGR276 | Sport Preference". Y-axis: Frequency, 0 to 20.]
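A short sketch of both chart types in R, using the sport preference counts listed on the slide. The variable name `sport_freq` and the percentage labels on the pie chart are illustrative choices, not part of the lecture code.

```r
# Sport preference counts as listed on the slide
sport_freq <- c(Basketball = 19, Hockey = 2, Baseball = 4,
                Soccer = 9, Football = 4, None = 9)

# Bar chart: one bar per category, bar height = frequency
barplot(sport_freq, main = "GGR276 | Sport Preference",
        ylab = "Frequency", ylim = c(0, 20), las = 2)

# Pie chart of the same counts, labelled with rounded percentages
pct <- round(100 * sport_freq / sum(sport_freq), 1)
pie(sport_freq, labels = paste0(names(sport_freq), " (", pct, "%)"))
```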

PARETO CHART
• Contains both bars and a line graph
• Bars display the values in descending/ascending order
• Line graph shows the cumulative totals
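Base R has no dedicated Pareto function, so the sketch below assembles one from `barplot()` and `lines()`. It reuses the sport preference counts from the bar chart slide. The right-hand percent axis is an added convenience, not something the slide specifies.

```r
# Sport preference counts from the bar chart slide
sport_freq <- c(Basketball = 19, Hockey = 2, Baseball = 4,
                Soccer = 9, Football = 4, None = 9)

# Bars in descending order of frequency
sorted  <- sort(sport_freq, decreasing = TRUE)
cum_tot <- cumsum(sorted)              # cumulative totals for the line
cum_pct <- 100 * cum_tot / sum(sorted) # same totals as percentages

# barplot() returns the x positions of the bar midpoints
mids <- barplot(sorted, ylim = c(0, sum(sorted)), las = 2,
                ylab = "Frequency", main = "Pareto chart: Sport Preference")

# Line graph of cumulative totals drawn over the bars
lines(mids, cum_tot, type = "b", pch = 19)

# Right-hand axis relabels the frequency scale as cumulative percent
axis(4, at = sum(sorted) * seq(0, 1, by = 0.25),
     labels = paste0(seq(0, 100, by = 25), "%"))
```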

DOT PLOT
• One dot represents each data point.
• Not good for large datasets

GGR276 | Pets

Pet       Frequency
Dog       ••••••••
Cat       ••••••
Reptile   ••
Fish      ••
Other     •
No Pet    ••••••••••••••••••••••••••••

[Figure: dot plots of "Fastest Ever Driving Speed", 226 Stat 100 Students, Fall '98 (100 Men, 126 Women). X-axis: Speed, 70 to 160.]
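One way to draw a dot plot in base R is `stripchart()` with stacked points, sketched below. The `speed` values are invented for illustration and are not the Stat 100 data.

```r
# Hypothetical driving speeds (illustration only)
speed <- c(90, 95, 100, 100, 100, 105, 110, 110, 120, 140)

# method = "stack" piles repeated values on top of each other,
# so each dot is one data point
stripchart(speed, method = "stack", pch = 19, xlab = "Speed",
           main = "Dot plot (hypothetical data)")
```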

Interpreting Graphs | Centre & Spread
• Symmetric: Even distribution
• Bimodal: Two peaks or clusters of data
• Positive Skew (Skewed right): Tail on right, majority of data on left.
• Negative Skew (Skewed left): Tail on left, majority of data on right.

Where is the data centered on the horizontal axis, and how does it spread out from the center?
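Skew can also be checked numerically. The sketch below uses made-up data and two common rough indicators that are not taken from the slides: comparing the mean with the median, and a moment-based skewness value whose sign gives the direction of the tail.

```r
# Hypothetical right-skewed data: most values small, one large value
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 20)

mean(x)    # 4.7, pulled toward the long right tail
median(x)  # 3

# Moment-based skewness: positive = skewed right, negative = skewed left
skew <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
skew
```

A histogram of `x` would show the same thing visually: the bulk of the data on the left and a tail on the right.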

Interpreting Graphs | Outliers
• Are there any strange or unusual measurements that stand out in the data set?

[Figure: two panels, "No Outliers" and "Outlier".]

Presenting Quantitative Data

….and many more!


Ogive (“oh-jive”) Chart

A graph that represents the cumulative frequencies for the classes in a frequency distribution

[Figure: ogive of Job Confidence Score. X-axis: Job Confidence Score, 1 to 10. Y-axis: Cumulative %, 0 to 100. Data labels on the curve: 0, 6, 19, 23, 38, 48, 69, 79, 88, 100.]

What is the percentage of students having confidence greater than 7?

What telephone bill value is at the 50th percentile?
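Questions like these are answered by reading cumulative percentages off the ogive. A minimal sketch in R, using hypothetical scores rather than the class data:

```r
# Hypothetical job confidence scores on a 1-10 scale (illustration only)
scores <- c(2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10)

# Frequency of each score from 1 to 10, including scores nobody gave
freq    <- table(factor(scores, levels = 1:10))
cum_pct <- 100 * cumsum(freq) / length(scores)   # cumulative percent

# Ogive: cumulative percent against score
plot(1:10, cum_pct, type = "b", ylim = c(0, 100),
     xlab = "Job Confidence Score", ylab = "Cumulative %",
     main = "Ogive (hypothetical data)")

# Percentage with a score greater than 7 = 100 - cumulative % at 7
pct_above_7 <- 100 - cum_pct[["7"]]
pct_above_7   # 35

# 50th percentile: first score whose cumulative % reaches 50
p50 <- as.integer(names(cum_pct)[which(cum_pct >= 50)[1]])
p50           # 7
```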

SCATTERPLOT
This type of plot becomes more useful when plotting two variables (bivariate plots).

[Figure: scatterplot, "GGR276 Happy vs Job Confidence". X-axis: Happy, 0 to 10. Y-axis: Job Confidence, 0 to 12.]

Bivariate SCATTERPLOT
• Summarizes the relationship between two quantitative variables
• Horizontal axis (X-axis) represents one variable, vertical axis (Y-axis) represents the other

R CODE: > plot(x_variable, y_variable)
EXCEL: Insert > Charts > Scatter
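A small runnable version of the `plot()` call, with invented paired scores standing in for the class data. The variable names `happy` and `job_confidence` are assumptions made for this sketch.

```r
# Hypothetical paired scores for the same 8 students (illustration only)
happy          <- c(3, 5, 6, 7, 7, 8, 9, 10)
job_confidence <- c(2, 4, 6, 5, 8, 7, 9, 9)

# First argument goes on the X-axis, second on the Y-axis
plot(happy, job_confidence,
     xlim = c(0, 10), ylim = c(0, 12),
     xlab = "Happy", ylab = "Job Confidence",
     main = "Happy vs Job Confidence (hypothetical data)")
```

Each point is one student, so the two vectors must have the same length and the same ordering.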

3D GRAPHS 

In some cases, 3D graphs can tell a better story.

Geovisualization
• Presenting spatial data
• Uses concrete visual representations and human visual abilities to enhance the communication of the spatial properties of phenomena or processes
• Facilitates identification and interpretation of spatial patterns and relationships in complex data

GEOVISUALIZATION | Obesity Trends in Canada 1985-2003

Definitions:
• Obesity – having a high amount of body fat in relation to lean body mass. Measured using Body Mass Index (BMI), a measure of an adult’s weight in relation to his or her height
  • BMI = Weight (kg) / (Height (m))²
  • BMI >25 = Overweight; BMI >30 = Obese

Data shown in the following slides comes from 3 sources:
• HPS – Health Promotion Survey
• NPHS – National Population Health Survey
• CCHS – Canadian Community Health Surveys

Statistics Canada, Source: P.T. Katzmarzyk, Unpublished Results. Data from: Statistics Canada. Health Indicators, June, 2004 (Acknowledgement: Dora Pouliou, PhD)
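The BMI definition and cut-offs above can be sketched as two small R functions. The function names, the "Not overweight" label, and the example weights and heights are assumptions for illustration.

```r
# BMI = weight (kg) / (height (m))^2
bmi <- function(weight_kg, height_m) weight_kg / height_m^2

# Cut-offs as stated on the slide: BMI > 25 overweight, BMI > 30 obese
bmi_class <- function(b) {
  ifelse(b > 30, "Obese",
         ifelse(b > 25, "Overweight", "Not overweight"))
}

# Three hypothetical adults, all 1.70 m tall
b <- bmi(c(60, 80, 95), c(1.70, 1.70, 1.70))
round(b, 1)    # 20.8 27.7 32.9
bmi_class(b)   # "Not overweight" "Overweight" "Obese"
```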

Obesity Trends Among Canadian Adults (*BMI ≥ 30, or ~ 30 lbs overweight for 5’4” person)

HPS, 1985

No Data...

