Lecture #5 (Feb 9) - Oded Meyer PDF

Title Lecture #5 (Feb 9) - Oded Meyer
Author YG PP
Course Probability And Statistics
Institution Georgetown University
Pages 18


MATH 040 – Lecture #5 – 2/9/2021
eText: Pages 33-35, 43-58

Last time…

The Boxplot (Another graphical display of quantitative data)

The boxplot is a graphical display of the five-number summary

Boxplots are thus best used for comparison of distributions of the same quantitative variable across several groups (side-by-side boxplots)

Example 1: Age of Oscar winners for acting in a leading role.

Statistics

Variable    Age.Actress    Age.Actor
N           40             40
Mean        38.60          44.20
StDev       13.67          9.69
Minimum     21.00          29.00
Q1          29.00          37.25
Median      33.50          42.50
Q3          44.75          50.00
Maximum     80.00          76.00

Let's summarize what the data tell us…

Actresses tend to win the Oscar at a younger age than actors. The median age for females (33.5) is lower than for the males (42.5). Furthermore, note also that the first quartile of the males' distribution (37.25) is higher than the median age for females (33.5). This tells us that while 75% of the actors were 38 years old or older when they won the Oscar, only 50% of the actresses were 34 years old or older.

The range of typical ages of actors (IQR = 12.75) is slightly lower than the range of typical ages of actresses (IQR = 15.75). On the other hand, the actresses have more variability in their overall ages (range = 59) compared to the actors (range = 47).

Both distributions have older winners that are outliers. These older winners are unusual and skew the distribution of ages to the right.
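The spread figures quoted above can be reproduced from the five-number summaries in the table. The sketch below is only an illustration: the dictionary layout and the function name `spread` are made up here, and the 1.5 × IQR fence is assumed to be the outlier rule in use (the notes do not restate it in this section).

```python
# Five-number summaries copied from the Statistics table above.
summaries = {
    "Age.Actress": {"min": 21.00, "q1": 29.00, "median": 33.50, "q3": 44.75, "max": 80.00},
    "Age.Actor":   {"min": 29.00, "q1": 37.25, "median": 42.50, "q3": 50.00, "max": 76.00},
}

def spread(s):
    """Return the IQR, the range, and the upper 1.5*IQR fence (assumed outlier rule)."""
    iqr = s["q3"] - s["q1"]
    return {
        "iqr": iqr,
        "range": s["max"] - s["min"],
        "upper_fence": s["q3"] + 1.5 * iqr,
    }

results = {name: spread(s) for name, s in summaries.items()}

for name, r in results.items():
    # A maximum above the upper fence means at least one high outlier.
    has_high_outlier = summaries[name]["max"] > r["upper_fence"]
    print(name, r, "maximum is a high outlier:", has_high_outlier)
```

Under that assumed rule, both maxima (80 and 76) lie above their upper fences, which agrees with the remark that both distributions have older winners that are outliers.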


Example 2: Average High Temperature (DC vs. SF)

The similarities and differences between the two distributions are striking.

The centers of the distributions are fairly close, with the median average temperature in San Francisco only slightly lower than that in DC (medians: 63.5 in SF vs. 65.5 in DC). However, the temperatures in DC have a much higher variability than the temperatures in SF (IQRs: 34.75 in DC vs. 7.25 in SF).

The practical implication is that the weather in SF is much more consistent than the weather in DC, which varies a lot during the year.

Because the temperatures in SF vary so little, knowing that the median temperature is around 63.5 is actually quite informative. On the other hand, knowing that the median temperature in DC is 65.5 is practically useless since it can get much warmer or much colder.

This example provides more intuition about variability by interpreting small variability as consistency and large variability as lack of consistency. We also learned that the center of the distribution is more meaningful as a typical value when there is little variability (or, as statisticians say, “little noise”) around it.

Exploratory Data Analysis: Two-Variable Data

So far:
* EDA for one variable - Examining Distributions
     - one categorical variable
     - one quantitative variable
Now:
* EDA for two variables - Exploring Relationships

When we analyze data on two variables, our first step is to distinguish between the response variable and the explanatory variable.

Examples:

1. Is there a relationship between gender (male / female) and salary (for instance, do males make more than females)?

   Explanatory variable: ______________      Response variable: ______________
   Categorical or Quantitative               Categorical or Quantitative

2. Is there a relationship between gender (male / female) and employment level (temp / full-time / management)?

   Explanatory variable: ______________      Response variable: ______________
   Categorical or Quantitative               Categorical or Quantitative

3. Can we use SAT score to predict freshman GPA?

   Explanatory variable: ______________      Response variable: ______________
   Categorical or Quantitative               Categorical or Quantitative

4. Is age a factor in the presence or absence of a certain disease?

   Explanatory variable: ______________      Response variable: ______________
   Categorical or Quantitative               Categorical or Quantitative

When examining the relationships, consider the:
     * Role (Explanatory, Response), and
     * Type (Categorical, Quantitative)
of the two variables:

The role-type classification table will frame our discussion about examining the relationships between two variables.
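As a rough sketch of how the role-type classification works, the lookup below maps the type of the explanatory variable and the type of the response variable to the case names used later in these notes (Case I, II, III). The function name `classify_case` and the role/type assignments given for the four example questions are my own reading, not something the handout fills in.

```python
def classify_case(explanatory_type, response_type):
    """Map (explanatory type, response type) to the case label used in these notes."""
    valid = {"categorical", "quantitative"}
    if explanatory_type not in valid or response_type not in valid:
        raise ValueError("types must be 'categorical' or 'quantitative'")
    cases = {
        ("categorical", "quantitative"): "Case I (C -> Q)",
        ("categorical", "categorical"): "Case II (C -> C)",
        ("quantitative", "quantitative"): "Case III (Q -> Q)",
    }
    # Q -> C is a possible combination, but no case is named for it in these notes.
    return cases.get((explanatory_type, response_type), "Q -> C (no case named in these notes)")

# Assumed readings of the four example questions: (explanatory type, response type).
examples = {
    "gender -> salary": ("categorical", "quantitative"),
    "gender -> employment level": ("categorical", "categorical"),
    "SAT score -> freshman GPA": ("quantitative", "quantitative"),
    "age -> disease present/absent": ("quantitative", "categorical"),
}

for question, (exp_type, resp_type) in examples.items():
    print(question, "=>", classify_case(exp_type, resp_type))
```

The point of the table is that the role and the type together decide which displays and numerical summaries are appropriate, so the lookup key is the ordered pair, not the two types in any order.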


Case I: Categorical explanatory, quantitative response

Comparing the distributions of the (quantitative) response across the different categories of the explanatory.

• Numerical measures: Descriptive statistics of the response for each level of the explanatory

Example: Hotdogs

Background: People who are concerned about their health may prefer hot dogs that are low in calories. A study was conducted by a concerned health group in which 54 major hot dog brands were examined, and their calorie contents recorded. In addition, each brand was classified by type: beef, poultry, and meat (mostly pork and beef, but up to 15% poultry meat). The purpose of the study was to examine whether the number of calories a hot dog has is related to (or affected by) its type.

Answering this question requires us to examine the relationship between the categorical variable Type and the quantitative variable Calories. Because the question of interest is whether the type of hot dog affects calorie content,

• the explanatory variable is:
• the response variable is:


Case II: Examining the relationship between two categorical variables

[C → C]

Example 1: Body image and gender

A random sample of 1200 U.S. college students was asked the following question:

How do you feel about your own body? Do you feel that you are underweight, overweight, or about right?

In addition to the response to the question above, the gender of each individual was recorded.

If we had separated the 1200 college students by gender and looked at males and females separately, would we have found a similar distribution across the body image categories? More specifically,

• Are males and females just as likely to think that their weight is about right?
• Among those who do not feel that their weight is about right, are there differences between the genders?

Answering these questions requires us to explore the relationship between the two categorical variables Gender and Body-image. Because the question of interest is whether there is a gender effect on body image,

• The explanatory variable is:

• The response variable is:

The first step is to summarize the data using a two-way table (also called a contingency table) of observed counts:

Supplement the two-way table with conditional % of the response (Body Image) for each category of the explanatory (Gender) separately.

            About Right    Overweight    Underweight    Total
Males
Females

What do the data tell us?

The distributions of Body Image of males and females are not the same! (In other words, Body Image is related to Gender → there is a gender effect.) For both genders, roughly 70% feel their weight is about right. The difference between the genders is among those who do not feel their weight is about right. Among these, the majority of females feel that they are overweight, while the proportions of males who feel they are overweight and those who feel they are underweight are the same.

To summarize the C → C case…

We explore the relationship between two categorical variables using:

• Two-way table (of counts) +

• Conditional percentages of the response for each level of the explanatory separately. (can use a stacked bar graph)

Note: Whether you calculate row percentages or column percentages depends on whether the explanatory variable defines the rows or the columns.

Example 2: Drinking and Driving and the Supreme Court

In the early 1970s a young man challenged an Oklahoma State law that prohibited the sale of 3.2% beer to males under 21 but allowed its sale to females in the same age group. The case (Craig v. Boren, 429 U.S. 190, 1976) was ultimately heard by the U.S. Supreme Court.

The Supreme Court examined evidence from a “random roadside survey” that measured information on age, gender, and whether or not the driver had been drinking alcohol in the previous two hours. The data were collected from 619 randomly chosen drivers under 20 years of age and are presented in the following two-way table of observed counts:

             Males    Females    Total
Drunk        77       16         93
Not Drunk    404      122        526
Total        481      138        619

Question: Is drunk driving related to gender?

In order to answer this question we need to calculate the appropriate conditional %.

Which of the following is the correct way to calculate the conditional % in this case?

             Males            Females          Total
Drunk        77/93 = 83%      16/93 = 17%      100%
Not Drunk    404/526 = 77%    122/526 = 23%    100%

             Males            Females
Drunk        77/481 = 16%     16/138 = 11.6%
Not Drunk    404/481 = 84%    122/138 = 88.4%
Total        100%             100%
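To see where the two candidate tables come from, the sketch below recomputes both sets of percentages from the observed counts. The function names are illustrative. Which set is "appropriate" follows the rule stated earlier in these notes: condition on the explanatory variable. I am assuming here that Gender is the explanatory variable, which puts it in the columns of this table.

```python
# Observed counts from the roadside survey table (rows: drinking status, columns: gender).
counts = {
    "Drunk":     {"Males": 77,  "Females": 16},
    "Not Drunk": {"Males": 404, "Females": 122},
}

def row_percentages(table):
    """Percent of each row total that falls in each column."""
    return {
        r: {c: 100 * v / sum(row.values()) for c, v in row.items()}
        for r, row in table.items()
    }

def column_percentages(table):
    """Percent of each column total that falls in each row."""
    columns = list(next(iter(table.values())).keys())
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100 * row[c] / totals[c] for c in columns}
        for r, row in table.items()
    }

row_pct = row_percentages(counts)      # conditions on drinking status
col_pct = column_percentages(counts)   # conditions on gender (assumed explanatory)

for status in counts:
    print(status, {g: round(p, 1) for g, p in col_pct[status].items()})
```

With Gender in the columns, the column percentages compare like with like: the share of male drivers who had been drinking against the share of female drivers who had been drinking.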


Example 3: Blood Pressure and Cardiovascular Death

Is high blood pressure a risk factor for cardiovascular death?

A longitudinal study followed a random sample of 2,200 patients for 20 years.

                        Cardiovascular Death?
                        Yes      No       Total
Low Blood Pressure      30       1770     1800
High Blood Pressure     20       380      400
Total                   50       2150     2200

                        Cardiovascular Death?
                        Yes                No                   Total
Low Blood Pressure      30/1800 = 1.7%     1770/1800 = 98.3%    1800
High Blood Pressure     20/400 = 5%        380/400 = 95%        400
Total                   50                 2150                 2200

                        Cardiovascular Death?
                        Yes               No       Total
Low Blood Pressure      30/50 = 60%       1770     1800
High Blood Pressure     20/50 = 40%       380      400
Total                   50                2150     2200
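The two percentage tables answer different questions, and a short calculation makes the contrast concrete. This is a sketch under the assumption that blood pressure is the explanatory variable (it defines the rows here); the variable names and the final ratio are illustrative additions, not part of the handout.

```python
# Observed counts from the longitudinal study (rows: blood pressure group).
counts = {
    "Low Blood Pressure":  {"Yes": 30, "No": 1770},
    "High Blood Pressure": {"Yes": 20, "No": 380},
}

# Row percentages: among patients in each blood pressure group,
# what percent died of cardiovascular causes?
death_rate = {
    group: 100 * row["Yes"] / (row["Yes"] + row["No"])
    for group, row in counts.items()
}

# Column percentages for the "Yes" column: among patients who died,
# what percent belonged to each blood pressure group?
total_deaths = sum(row["Yes"] for row in counts.values())
share_of_deaths = {
    group: 100 * row["Yes"] / total_deaths for group, row in counts.items()
}

# Ratio of the two death rates (high vs. low blood pressure).
rate_ratio = death_rate["High Blood Pressure"] / death_rate["Low Blood Pressure"]

print({g: round(p, 1) for g, p in death_rate.items()})
print({g: round(p, 1) for g, p in share_of_deaths.items()})
print("high/low ratio of death rates:", round(rate_ratio, 2))
```

Most of the deaths occur in the low blood pressure group only because that group is much larger (1800 vs. 400 patients). Conditioning on blood pressure removes the effect of group size, and on that basis the death rate in the high blood pressure group is about three times the rate in the low group.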


Case III: Examining the relationship between two Quantitative Variables

[Q → Q]

Example 1: Age and Memory

It is a well-known fact that memory declines with age. In fact, studies suggest that memory starts declining in our twenties, but it is only when we're older, and there is enough cumulative decline to affect our daily lives, that we actually start noticing it.

In this example we will discuss one type of memory called working memory. “This form of memory is commonly referred to as one's attention span and lasts up to one minute before being erased. For example, trying to dial a telephone number that you have just heard requires the use of working memory.” (Source: Medical Care Corporation website).

In this example we'll examine the relationship between working memory and age. Understanding how working memory declines with age in normal individuals will allow us to identify those whose memory loss is more rapid than normal, which might be a symptom of a more serious medical condition.

With this goal in mind a cognitive psychologist conducted a study in which 65 healthy professional adults aged 25 to 70 were given a series of tests that measured their working memory on a scale of 0-200.

(Comment: One common test for measuring working memory is known as the “Letter-Number Sequencing” test in which a subject is read, for example, the sequence J-4-F-1-T-8 and is required to repeat it but place the numbers in numerical order and then the letters in alphabetical order.)

Since the purpose of this study is to explore the effect of age on working memory,

• the explanatory variable is Age, and
• the response variable is Memory Score.
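For readers unfamiliar with the Letter-Number Sequencing task mentioned in the comment above, the expected answer for a given sequence can be sketched in a few lines. The function name and the hyphen-separated input format are assumptions made for illustration; this is not how the test is scored in the study.

```python
def letter_number_answer(sequence, sep="-"):
    """Return the numbers in numerical order followed by the letters in alphabetical order."""
    items = sequence.split(sep)
    numbers = sorted((item for item in items if item.isdigit()), key=int)
    letters = sorted(item for item in items if item.isalpha())
    return sep.join(numbers + letters)

answer = letter_number_answer("J-4-F-1-T-8")
print(answer)  # 1-4-8-F-J-T
```

The subject has to hold all six items in mind while reordering them, which is why the task is used as a measure of working memory.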


Subject    Age    Memory Score
1          33     145
2          54     112
3          50     172
4          28     158
.
.
.
64         69     116
65         37     146

Here is the completed scatterplot:

Interpreting the Scatterplot:

Direction:

Form:

Strength:

Outliers:

Back to our example:

Two more examples:

1.

Rate of reproduction is the number of people that one infected person will pass a virus on to, on average.

• If the reproduction rate is higher than one, then the number of cases increases very fast.
• If the number is lower than one, the disease will eventually stop spreading, as not enough new people are being infected to sustain the outbreak.
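The threshold at one can be illustrated with a deliberately simplified model: assume every case in one generation infects, on average, `r` people in the next, so expected new cases are multiplied by `r` each generation. The function name, the starting value, and the two example rates (2.0 and 0.5) are hypothetical; real outbreaks are not this regular.

```python
def expected_new_cases(r, generations, initial_cases=1.0):
    """Expected new cases per generation under a simple geometric model (an assumption)."""
    return [initial_cases * r ** k for k in range(generations + 1)]

growing = expected_new_cases(2.0, 5)     # reproduction rate above one
shrinking = expected_new_cases(0.5, 5)   # reproduction rate below one

print("r = 2.0:", growing)
print("r = 0.5:", shrinking)
```

With a rate above one the sequence grows geometrically, and with a rate below one it shrinks toward zero, which is the sense in which the outbreak cannot sustain itself.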


2. Getting a better intuition of the form of the relationship

A study examined how the percentage of individuals who complete a survey is related to the monetary incentive that researchers promised in return.

Which of the three scatterplots below do you think displays the actual data?

