Lecture notes, lecture Sep 6 PDF

Title	Lecture notes, lecture Sep 6
Author	EMILY LEAH ARSHONSKY
Course	Research And Data Analysis In Psychology
Institution	University of California, Berkeley


Description

9/6 Lecture Notes

Measures continued
● Reliability → repeatable (test/retest) → are the results consistent?
● Validity → are we measuring what we want to measure?

Measurement error: problems in the way we measured our variable
● Example: measuring how much someone likes Facebook
  ○ Amount of time spent on it (# of hours)
  ○ Active vs. passive use of the app (the way they use Facebook)
  ○ In comparison to other social media sites
  ○ Watch people’s facial expressions as they use it (using a camera)
  ○ Neurotransmitters released when using it

Reducing measurement error
● Operationalize our variables
● Draw from different sources of data → various measurements, see how much they converge

Observational data
● Ratings made by others
  ○ Good → external, multiple sources, real-world
  ○ Bad → difficult to collect and measure, observers can be biased, multiple interpretations
● Hawthorne effect: observation changes behavior
  ○ Laboratory environment (to control surrounding features for consistency)
  ○ Awareness of being observed alters behavior
  ○ How representative is this of real life?
● Habituation: people get used to observation
● Jane Goodall’s chimpanzee observational technique
  ○ Biased when assigning names to chimps (names carry connotations)
    ■ Agenda? Humanizing the animals?
  ○ Human in chimpanzee environment
  ○ Overall good technique

Self data
● What a person says about herself/himself (self-report surveys, responses to interviews)
  ○ Good → structured and easy, people do have knowledge about themselves, access to private states (values)
  ○ Bad → people are biased (self-enhancement), people might lack knowledge about their own behavior, people may lie about sensitive issues

Final project → definitely self data, extra credit for observational data

Recap:
● Who do we sample from?
  ○ Population vs. sample
  ○ Sampling biases
● What do we measure?
  ○ Quality of data (reliability, validity)
  ○ Sources of data (observation, self)

Defining our Data
● Data are information
● Quantitative data: concepts reduced to numbers
  ○ Categorical → values of the variable represent groups (data fall into categories)
  ○ Continuous → values of the variable represent an “infinite” range of possibilities (data on a spectrum)
● Operationalization
  ○ Dependent variable must be CONTINUOUS
  ○ Independent variable can be any type of quantitative variable
● Qualitative data: not reduced to numbers

Identity Spectrum
● Gender/sex = spectrum
● Should measure sample age, # of participants, maybe gender/sex

Likert Scales → way to assess continuous variables
● Aggregation: alone, each question is categorical
● Combine related questions together to create a continuous distribution
● Features of a good Likert scale
  ○ Multiple face-valid questions (items)
  ○ Some reverse-scored items (same variable, but opposite end of spectrum)
  ○ Response scale → odd #, balanced, labeled
    ■ Not at all → all the time
    ■ Strongly disagree → strongly agree

KEY IDEA: don’t reinvent the wheel
● Utilize other studies
● http://ipip.ori.org/newIndexofScaleLabels.htm
  ○ Evaluate the items → is this measuring what I want to measure?
  ○ Adapt/edit as necessary → just explain what you did

Data organization
● Vector = one-dimensional set of same-type data
  ○ Stores data as a variable
  ○ Index [i]
● Data.frame = two-dimensional collection of vectors
  ○ Stores related variables (dataset)
  ○ Index [i (row), j (column)]
  ○ Row, column → RC
  ○ Example: data[1, 4] (1st row, 4th column) → will give you 1

person  hours  way using facebook  happiness
1       2      A                   1
2       3      P                   3
3       6      P                   5
4       1      A                   6
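
The example table can be rebuilt in R to try out the indexing rules. This is a minimal sketch, not code from the lecture: the object name `fb_data` and the column names are chosen here for illustration, and reading A/P as active vs. passive use is an assumption based on the earlier measurement example.

```r
# Rebuild the example table from the notes as a data.frame.
# fb_data and its column names are hypothetical, chosen for this sketch.
fb_data <- data.frame(
  person    = c(1, 2, 3, 4),
  hours     = c(2, 3, 6, 1),
  use       = c("A", "P", "P", "A"),  # assumed: A = active, P = passive
  happiness = c(1, 3, 5, 6)
)

# A vector is one-dimensional, so it takes a single index.
hours <- fb_data$hours
second_hours <- hours[2]          # 2nd element of the vector

# A data.frame is two-dimensional, so it takes [row, column].
first_happiness <- fb_data[1, 4]  # 1st row, 4th column
third_person    <- fb_data[3, 1]  # 3rd row, 1st column

head(fb_data)      # prints the first few rows
fb_data[, 1]       # first column, returned as a vector
fb_data[1, ]       # first row, returned as a one-row data.frame
fb_data$happiness  # one variable, selected by name
```

Selecting by name with `$` is usually easier to read than a column number, and it keeps working if the column order changes.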

Example: data[3, 1] (3rd row, 1st column) → 3

Navigating Data Sets in R
● Structure
  ○ Each row is a participant
  ○ Each column is a variable
● Navigation
  ○ head(data) → prints the first few rows
  ○ data[1] OR data[,1] → prints first column (data[1] keeps it as a one-column data.frame; data[,1] returns a vector)
  ○ data[1,] → prints first row
  ○ data$variable → prints that variable
    ■ $ = object within another object

Data Cleaning
● Your data will often need some “cleaning” before they can be analyzed
  ○ TutoRial2 → module 3
    ■ Rename a variable
    ■ Remove an obviously wrong entry
    ■ Edit an obviously wrong entry
    ■ Change type of data

Describing our Data → a language to talk about data
● Continuous vs. categorical distributions
● Centrality = “the three M’s”
  ○ Mean
  ○ Median
  ○ Mode
● Complexity = “the three S’s”
  ○ Shape
  ○ Spread
  ○ Standard Deviation
● Distribution
  ○ Histogram - a visualization for one variable
    ■ Categorical → plot(data$CatVar) (draws a bar chart when CatVar is stored as a factor)
    ■ Continuous → hist(data$ContVar)
  ○ Three things to look for:
    ■ Centrality - how are data clustered?
    ■ Complexity - how are data spread out?
    ■ Clarity - do the data make sense?
  ○ Do not just accept data as truth
    ■ Is this correct or incorrect?

Theoretical distribution:
● What we expect the distribution to look like given true random probability of occurrence
● The shape of the distribution depends on the type of data

Familiar concepts → expected distribution
● Symmetry → usually Normal distribution
● Large percentage of scores at the center of the distribution
● Small percentage of scores expected to be at the extreme ends of the distribution
● Normal distribution → defines the probability of a range of scores
  ○ Hypothetical distribution that occurs when there are multiple explanations for complexity, and when these explanations occur randomly
  ○ An assumption...
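
The centrality and spread measures, and the idea that a normal distribution "defines the probability of a range of scores", can be sketched in base R. The scores, and the mean of 100 and SD of 15 used for the normal example, are made up for illustration and are not course data.

```r
# Hypothetical scores, invented for this sketch.
scores <- c(1, 3, 3, 4, 5, 5, 5, 6, 7)

# Centrality: the three M's
score_mean   <- mean(scores)
score_median <- median(scores)
# Base R has no built-in for the statistical mode, so count each value
# and take the most frequent one (ties resolve to the smallest value).
counts     <- table(scores)
score_mode <- as.numeric(names(counts)[which.max(counts)])

# Spread: sample standard deviation (n - 1 in the denominator)
score_sd <- sd(scores)

# Normal distribution: probability that a score falls in a range.
# Assumed here: mean = 100, sd = 15; the range is mean +/- 1 SD.
p_range <- pnorm(115, mean = 100, sd = 15) - pnorm(85, mean = 100, sd = 15)

print(c(mean = score_mean, median = score_median, mode = score_mode, sd = score_sd))
print(p_range)  # roughly 0.68: most scores sit near the center
```

The last value matches the pattern described above: a large share of scores near the center and only a small share at the extreme ends.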