Class Notes 8 - A cross tabulation allows you to combine two categorical variables to learn PDF

Title	Class Notes 8 - A cross tabulation allows you to combine two categorical variables to learn
Course	Empirical Methods in Economics
Institution	Hofstra University
Pages	2

Description

1. Cross-tabulations / Contingency tables - In addition to frequency tables, categorical variables (nominal or ordinal) can be examined with a cross-tabulation, also called a contingency table. A cross-tabulation combines two categorical variables so you can learn more about their relationship or their joint distribution. An example is to show the percentage of colleges that require or recommend SAT scores, by type of college.
• The independent variable is the type of college (it is not influenced by SAT policy). The dependent variable is SAT policy, since it is determined by the type of college.
• The independent variable goes in the columns and the dependent variable goes in the rows.
• You can use a continuous variable in a cross-tabulation if you first transform it into categories using "recode".

2. Scattergrams - A scattergram is a graphic representation of points referencing two variables. It shows whether the two variables have a positive or a negative correlation/relationship.

3. Measures of association / PRE / Lambda - A measure of association measures the association between two or more variables (the opposite of association is independence). It determines whether an association exists between the variables by quantitatively measuring the strength and the direction of the association. All measures of association are trying to quantify the proportionate reduction in predictive error.
• $R^2$ is a measure of association: it is the coefficient of determination (never negative). $r$ is the correlation coefficient and shows the direction of the association, positive or negative.
• So if $r = 0.7$, then $R^2 = 0.49$. There is a 49% reduction in the error of predicting the dependent variable once we add the information about the independent variable, or equivalently, 49% of the variation in the dependent variable is explained.
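The link between r, R squared, and the reduction in predictive error can be checked numerically. The sketch below is only an illustration: the data values and the helper names (`pearson_r`, `pre_from_regression`) are made up for this example and are not part of the class notes. It assumes a simple least-squares line, takes Error 1 as the squared error from always guessing the mean of the dependent variable, and takes Error 2 as the squared error left after using the independent variable.

```python
# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # independent variable
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.6]   # dependent variable


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Correlation coefficient r: its sign gives the direction of association."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pre_from_regression(x, y):
    """(Error 1 - Error 2) / Error 1 for a least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    # Error 1: squared error when we ignore x and always predict the mean of y.
    error_1 = sum((b - my) ** 2 for b in y)
    # Error 2: squared error when we predict y from x with the fitted line.
    error_2 = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return (error_1 - error_2) / error_1


r = pearson_r(x, y)
r_squared = r ** 2
pre = pre_from_regression(x, y)

print("r =", round(r, 4))
print("R squared =", round(r_squared, 4))
print("PRE from regression =", round(pre, 4))
```

For a straight-line fit the last two printed values agree, which is why R squared can be read as a proportionate reduction in error while r alone carries the sign.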
The proportional reduction in error (PRE) is the gain in precision of predicting a dependent variable from knowing the independent variable. It quantifies the extent to which knowledge about one variable helps us predict another variable. So if there is a perfect correlation, knowing variable X allows you to predict variable Y with 100% accuracy, and if there is zero correlation, knowing variable X does not help you predict variable Y.

4. Calculating Lambda - Lambda is a measure of association that is appropriate for nominal variables (mutually exclusive attributes with no ordering, ex: state of birth) and ordinal variables (ordered attributes, but the difference between attributes has no meaning). Lambda is also called the Goodman-Kruskal measure of predictive error, and its formula is

$$\text{Lambda} = \frac{\text{Error 1} - \text{Error 2}}{\text{Error 1}}$$

where Error 1 is the number of prediction errors made without knowing the independent variable and Error 2 is the number of errors made when the independent variable is known.

Lambda is an asymmetric measure of association (it produces different values depending on which variable is treated as independent). Symmetric lambda is an average of the two asymmetric values, weighted by each direction's Error 1.
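The cross-tabulation and lambda calculation described in these notes can be sketched in a few lines. Everything specific here is an assumption made for illustration: the counts are invented (they are not real data on colleges), and the function names `marginals`, `column_percentages`, and `goodman_kruskal_lambda` are hypothetical helpers, not from any course software. Error 1 counts the wrong guesses made when always predicting the overall modal category of the dependent variable. Error 2 counts the wrong guesses made when predicting the modal category within each category of the independent variable.

```python
from collections import Counter

# Hypothetical counts, invented for illustration:
# (college type, SAT policy) -> number of colleges.
counts = {
    ("private", "required"): 30,
    ("private", "recommended"): 10,
    ("private", "neither"): 10,
    ("public", "required"): 10,
    ("public", "recommended"): 25,
    ("public", "neither"): 15,
}


def marginals(counts, position):
    """Total count per category of the variable at `position` (0 or 1)."""
    totals = Counter()
    for key, n in counts.items():
        totals[key[position]] += n
    return totals


def column_percentages(counts):
    """Percent of each college type (column) in each SAT policy (row)."""
    col_totals = marginals(counts, 0)
    return {
        (ctype, policy): 100.0 * n / col_totals[ctype]
        for (ctype, policy), n in counts.items()
    }


def goodman_kruskal_lambda(counts, independent, dependent):
    """Lambda = (Error 1 - Error 2) / Error 1 for the chosen direction.

    Assumes Error 1 is not zero, i.e. the dependent variable has
    more than one occupied category.
    """
    dep_totals = marginals(counts, dependent)
    n_total = sum(dep_totals.values())
    # Error 1: always guess the overall modal category of the dependent variable.
    error_1 = n_total - max(dep_totals.values())
    # Error 2: guess the modal category within each independent category.
    best_in_category = {}
    for key, n in counts.items():
        category = key[independent]
        best_in_category[category] = max(best_in_category.get(category, 0), n)
    error_2 = n_total - sum(best_in_category.values())
    return (error_1 - error_2) / error_1


percentages = column_percentages(counts)
lambda_policy_from_type = goodman_kruskal_lambda(counts, independent=0, dependent=1)
lambda_type_from_policy = goodman_kruskal_lambda(counts, independent=1, dependent=0)

for (ctype, policy), pct in sorted(percentages.items()):
    print(ctype, policy, round(pct, 1))
print("lambda (SAT policy predicted from college type):", lambda_policy_from_type)
print("lambda (college type predicted from SAT policy):", lambda_type_from_policy)
```

With these made-up counts the two directions give different values, which is the asymmetry noted above. Percentages are computed within each column because the independent variable is placed in the columns.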

2...