Title | SAS Tutorial 04 Logistic Regression |
---|---|
Course | Predictive Science & Engineering Design Cluster Seminar |
Institution | Northwestern University |
Pages | 4 |
SAS Tutorial: Logistic Regression

Data Directory:
Data can be accessed on the SAS OnDemand server using this libname statement.

    libname mydata '/home/raymondanden/my_courses/donald.wedding/c_8888/PRED411/UNIT00' access=readonly;

NOTE: Change raymondanden to your SAS Studio user ID.
Data Set:
mydata.financial_ratios
Source of Data:
Regression Analysis By Example by Chatterjee and Hadi, pg 338.
Data Description:
Y: Business status after 2 years (1=Still in business, 0=Bankrupt)
X1: Retained Earnings
X2: Earnings Before Interest and Taxes
X3: Sales
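Before running the analysis, it can help to confirm that the libname assignment works and that the data set contains the variables described above. A minimal sketch, assuming the mydata libref has already been assigned with the libname statement shown earlier:

```sas
* list the variables in the data set along with their types and lengths;
proc contents data=mydata.financial_ratios;
run;

* print the first five observations as a quick sanity check;
proc print data=mydata.financial_ratios (obs=5);
run;
```

The obs=5 data set option limits the listing to the first five rows, so the check stays short on larger data sets.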
Tutorial Instructions:
For this tutorial we demonstrate how to fit and interpret a logistic regression model and produce the default graphics using PROC LOGISTIC.

(1) Use the PROC MEANS procedure to produce simple univariate descriptive statistics for numeric variables. PROC MEANS requires at least one numeric variable. Choose which statistics to display by listing keywords such as min, p5, p25, etc. on the PROC MEANS statement, and set the number of decimals with ndec= (the documented name of this option is maxdec=).

    * examine continuous variables for predictive relevance to response variable;
    Title "Logistic Regression EDA - Examine Means";

    * examine the min, 25th, 50th, and 75th percentiles, and max, by level of Y;
    proc means data=mydata.financial_ratios min p25 p50 p75 max ndec=2;
        class Y;
        var X1 X2 X3;
    run;

    * examine the 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles by level of Y;
    proc means data=mydata.financial_ratios p5 p10 p25 p50 p75 p90 p95 ndec=2;
        class Y;
        var X1 X2 X3;
    run;
Since we are trying to identify which variables have predictive relevance to the response variable Y, use the CLASS statement to show the statistics separately for Y=1 and Y=0.
(2) After examining the PROC MEANS results, you may need to create discretized variables. Use a DATA step to create discrete variables from the continuous variables. Then use PROC FREQ to examine how the frequencies of the discretized variables relate to the response variable.

    data bankrupt;
        set mydata.financial_ratios;
        * example of discretizing continuous variables;
        if (X1...
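As an illustration of the discretize-then-tabulate pattern described in step (2), here is a minimal sketch. The output data set name bankrupt_sketch, the variable name X1_discrete, and the cut point of 0 are all hypothetical choices made for this example and are not taken from the tutorial; in practice the cut points would come from the percentiles reported by PROC MEANS.

```sas
* hypothetical example: split X1 into two groups at an assumed cut point of 0;
data bankrupt_sketch;
    set mydata.financial_ratios;
    * keep missing values missing instead of assigning them to a group;
    if missing(X1) then X1_discrete = .;
    else if X1 > 0 then X1_discrete = 1;
    else X1_discrete = 0;
run;

* cross-tabulate the discretized variable against the response variable;
proc freq data=bankrupt_sketch;
    tables X1_discrete*Y / nocol nopercent;
run;
```

With the nocol and nopercent options, each cell shows only the count and the row percentage, so the share of Y=1 versus Y=0 within each X1_discrete group can be read directly from the table.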