Stata Cheat Sheet PDF

Title Stata Cheat Sheet
Author Celina Stewart
Course Managerial Decision Analysis
Institution Northwestern University
Pages 2

Stata Cheat Sheet

Information- and Technology-Based Marketing, Professor Florian Zettelmeyer

Columns: Command | Additional Options | In "User" Menu | Useful for what? | More info

Keeping Stata up to date

Command: update all
In "User" Menu: N/A
Useful for what? This tells Stata to update itself to the newest version.
More info: Stata help under "update"

Command: update swap
In "User" Menu: N/A
Useful for what? Issue this command if Stata downloaded a new "exe" file during the "update all" command. Stata will tell you when it is needed. Failure to do so will create problems with several commands.
More info: Stata help under "update"

Set Stata Preferences
In "User" Menu: "Set Stata Preferences"

Command: window manage prefs default
In "User" Menu: "Restore Default Window Layout (window manage)"
Useful for what? This restores the default window arrangement of Stata. Use it if your windows are messed up or if you can’t see the Command Window.
More info: "Stata Tutorial"
Ex: window manage prefs default

Command:
  clear
  set memory #m
Additional Options: permanently
In "User" Menu: "Clear Memory (clear)", then "Set Memory for Data (set memory)"
Useful for what? This allocates # MB of memory to Stata. To figure out how much to allocate, look at how big the data set is on the hard drive and double the number.
More info: "Stata Tutorial"
Ex:
  clear
  set memory 20m, permanently

Command:
  clear
  set matsize #
Additional Options: permanently
In "User" Menu: "Clear Memory (clear)", then "Set Memory in Procedures (set matsize)"
Useful for what? This specifies that Stata can use up to # variables in a regression. Usually 200 is plenty. Use the option "permanently" if you want the matsize setting to stick after you quit Stata.
More info: "Stata Tutorial"
Ex:
  clear
  set matsize 200

Recording your work
In "User" Menu: "Record your work"

Command: log using filename
Additional Options: replace; append
In "User" Menu: "Open Results Log (log using)"
Useful for what? Opens a "Results log" file that records everything that appears in the Results window in Stata. This file can later be opened in any text editor to cut and paste into MS Word or another word processor. With the option "replace", an existing log file of the same name is overwritten. With the option "append", what you do next is appended to an existing log file of the same name.
More info: "Stata Tutorial"
Ex: log using "C:\data\BBB_analysis.log", replace

Command: log close
In "User" Menu: "Close Results Log (log close)"
Useful for what? This closes the Results log file.

Command: cmdlog using filename
Additional Options: replace; append
In "User" Menu: "Open Command Log (cmdlog using)"
Useful for what? Opens a "Commands log" file that records your commands separately from your results so that you can quickly replicate what you did. This file can later be opened in the do-file editor to create a do-file. With the option "replace", an existing command log file of the same name is overwritten. With the option "append", what you do next is appended to an existing command log file of the same name.
More info: "Stata Tutorial"
Ex: cmdlog using "C:\data\BBB_analysis.txt", replace

Command: cmdlog close
In "User" Menu: "Close Commands Log (cmdlog close)"
Useful for what? This closes the Command log file.

Open/Save data
In "User" Menu: "Open/Save Data"

Command: use filename
Additional Options: clear
In "User" Menu: "Open Stata Dataset (use)" or use Toolbar
Useful for what? Opens a Stata data file. Make sure to specify the full path. Often it is easier to read the data in using the Toolbar.
More info: "Stata Tutorial"
Ex: use "C:\data\BBB.dta", clear

Command: save filename
Additional Options: replace
In "User" Menu: "Save Stata Dataset (save)" or use Toolbar
Useful for what? Saves the data in memory as a Stata data file. Make sure to specify the full path. Often it is easier to save the data using the Toolbar.
More info: "Stata Tutorial"
Ex: save "C:\data\BBB.dta", replace

Command: insheet using filename
In "User" Menu: "Import From Spreadsheet (insheet)"
Useful for what? This allows you to import data from Excel very easily. Just make sure you have exported the data from Excel in tab-delimited format and that the first row of the spreadsheet contains the variable names you want. Often it is easier to read the data in using the dialog box from the "User" menu instead of typing the command.
More info: "Stata Tutorial", Stata help under "insheet"
Ex: insheet using "C:\data\mydata.txt"

Command: outsheet using filename
In "User" Menu: "Export To Spreadsheet (outsheet)"
Useful for what? This allows you to export data to Excel very easily. Often it is easier to export the data using the dialog box from the "User" menu instead of typing the command.
More info: Stata help under "outsheet"
Ex: outsheet using "C:\data\mydata.txt"

Viewing data as spreadsheet
In "User" Menu: "View Data as Spreadsheet"

Command: browse
In "User" Menu: "Browse Data (browse)" or use Toolbar
Useful for what? Shows a spreadsheet with all the data in it. Great to get an overview of the data.
More info: "Stata Tutorial"

Command: browse var1 var2 ...
In "User" Menu: N/A
Useful for what? This works just like browse but it only shows the listed variables. This works very well when you have many variables but you are only interested in a few of them.
More info: Stata help under "browse"
Ex: browse buyer acctnum

Command: edit
In "User" Menu: "Edit/Modify Data (edit)" or use Toolbar
Useful for what? Same as browse, but with the ability to edit/modify the data.
More info: "Stata Tutorial"

Command: edit var1 var2 ...
In "User" Menu: N/A
Useful for what? Same as browse var1 var2 ..., but with the ability to edit/modify the data.
More info: Stata help under "edit"
Ex: edit buyer acctnum

Summarize and describe data
In "User" Menu: "Summarize and Describe Data"

Command: describe
In "User" Menu: "Describe Data in Memory (describe)"
Useful for what? Gets a listing of all variables in the dataset and shows whether they are numerical or string variables. This also tells you how much free memory you have.
More info: "Stata Tutorial"

Command: label list
In "User" Menu: "View Data Labels (label list)"
Useful for what? Some variables are displayed with string labels although they are actually coded as numbers. This allows you to see what numbers correspond to what labels.
More info: "Stata Tutorial"
Ex: label list

Command: summarize var1 var2 ...
Additional Options: by()
In "User" Menu: "Simple Summary Statistics (summarize)"
Useful for what? This is the standard command to get summary statistics (mean, standard deviation, min, max) for a variable. Remember that this only works well for continuous variables.
More info: "Stata Tutorial"
Ex: summarize purch last

Command: tabstat var1 var2 ..., statistics(stat1 stat2 ...)
Additional Options: by()
In "User" Menu: "Complex Summary Statistics (tabstat)"
Useful for what? This is a much more powerful version of the summarize command. You can ask for many types of summary statistics stat1 stat2 ..., including the "median", the "sum" of the values of the variables, a "count" of the number of observations, etc.
More info: "Stata Tutorial"
Ex: tabstat total_ purch, statistics(count mean sd median) by(gender)

Command: tabulate var
Additional Options: by()
In "User" Menu: "Tabulate (tabulate)"
Useful for what? Shows all the values of a categorical variable and how many observations have each value.
More info: "Stata Tutorial"
Ex: tabulate gender

Command: tabulate var1 var2
Additional Options: row; column; chi2
In "User" Menu: "Cross-Tabulate (tabulate)"
Useful for what? Performs a cross tab between var1 and var2. Allows you to see whether two categorical variables are associated. Use the "chi2" option after the comma to test whether the two variables are associated.
More info: "Stata Tutorial"
Ex: tabulate gender buyer, chi2

Graph data
In "User" Menu: "Graph Data"

Command: scatter var1 var2
Additional Options: by()
In "User" Menu: "Draw Scatter Plot (scatter)"
Useful for what? This is a good command to get a graphic depiction of how two continuous variables var1 and var2 are associated.
More info: "Stata Tutorial", Logistics Lecture
Ex: scatter geog art

Command: graph bar (statistic) var1, over(var2)
Additional Options: by()
In "User" Menu: "Draw Bar Graph (graph bar)"
Useful for what? This creates a bar graph of var1 over var2, where the Y axis depicts the mean, the sum, or whatever statistic of var1 is specified in (statistic).
More info: "Stata Tutorial"
Ex: graph bar (mean) buyer, over(region)

Command: histogram var, frequency
Additional Options: by(); discrete
In "User" Menu: "Draw Histogram (histogram)"
Useful for what? This displays the distribution of var. This is very useful to get a feel for what values the data have and how often they occur. If your data is discrete and you want one bar for each value, use the option "discrete".
More info: "Stata Tutorial"
Ex: histogram total_, frequency

Manipulate Variables and Observations
In "User" Menu: "Create and Modify Variables"

Command: generate newvar=...
In "User" Menu: "Generate New Variable (generate)"
Useful for what? Generates a new variable with the name newvar from other variables and manipulations thereof.
More info: "Stata Tutorial"
Ex: generate ordersize=total_/purch

Command: egen newvar=mean(var1), by(var2)
In "User" Menu: "Generate New Variables on Steroids (egen)"
Useful for what? This creates a new variable newvar that contains the mean of var1 for each different value of var2. Instead of the mean one can also use "sum", "max", "min", "count", and many other statistics that are calculated from var1 for each category of var2 separately.
More info: RFM Lecture, Stata help under "egen"
Ex: egen res_prob=mean(buyer), by(zip3)

Command: replace var=...
In "User" Menu: "Replace/Change Existing Variable (replace)"
Useful for what? Replaces the value of var with what is on the right hand side of the "=" sign.
More info: "Stata Tutorial"
Ex: replace female=0 if female==.
Ex: replace monet_dec=11-monet_dec (This "flips" the values of the N-tile to make sure that the "best" customers are always in the first N-tile. For deciles the general form is "replace var=11-var".)

Command: xtile newvar=var, nquantiles(N)
In "User" Menu: "Create deciles or quintiles (xtile)"
Useful for what? Creates a new variable newvar containing the value of the N-tile based on the values of var. nquantiles(N) specifies whether one gets quintiles (N=5) or deciles (N=10) or any other desired number of categories.
More info: RFM Lecture, "RFM_BBB_stata.do", Stata help under "xtile"
Ex: xtile purch_dec=purch, nquantiles(10)
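
The summary and variable-manipulation commands on this page can be tried end to end without the course files. The following is a minimal sketch, assuming Stata's bundled auto dataset (loaded with sysuse) as a stand-in for the BBB data. The variable names mean_price, price_dec, and price_per_lb are illustrative and do not come from the course material.

```stata
* Sketch only: the bundled auto dataset stands in for the course data
sysuse auto, clear

* Summary statistics for price, separately for each value of foreign
tabstat price, statistics(count mean sd median) by(foreign)

* Store the group mean of price on every observation
egen mean_price = mean(price), by(foreign)

* Deciles of price: 1 = lowest prices, 10 = highest prices
xtile price_dec = price, nquantiles(10)

* Flip the deciles so that decile 1 holds the highest prices
replace price_dec = 11 - price_dec

* Build a new variable from existing ones
generate price_per_lb = price/weight
```

Running tabulate price_dec afterwards is a quick way to check that the flipped deciles still run from 1 to 10.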


Manage String Variables
In "User" Menu: "Manage String Variables"

Command: encode var, generate(newvar)
In "User" Menu: "Convert String Variable to Integer (encode)"
Useful for what? Generates a new integer variable with the name newvar which has a unique value for each different entry in the string variable var. In addition, the string values are all shown in the value labels.
More info: "Stata Tutorial"

Command: label list
In "User" Menu: "View Data Labels (label list)"
Useful for what? Some variables are displayed with string labels although they are actually coded as numbers. This allows you to see what numbers correspond to what labels.
More info: "Stata Tutorial"
Ex: label list
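
To see how encode and label list work together, here is a small sketch. It assumes Stata's bundled auto dataset, whose variable make is a string; the new variable name make_id is made up for illustration.

```stata
* Sketch only: the bundled auto dataset stands in for the course data
sysuse auto, clear

* make is a string variable; make_id becomes an integer code for it
encode make, generate(make_id)

* make is stored as a string, make_id as a labeled number
describe make make_id

* Show which number corresponds to which original string
label list make_id
```

By default encode names the value label after the new variable, which is why label list make_id finds it.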

Drop/Keep Variables and Observations
In "User" Menu: "Drop/Keep Variables and Observations"

Command: drop var1 var2 ...
In "User" Menu: "Drop Variables (drop / keep)"
Useful for what? Deletes var1, var2, etc. Careful, you can't get the variables back!
Ex: drop zip zip3

Command: drop if var==value
In "User" Menu: "Drop Observations (drop if / keep if)"
Useful for what? Deletes observations (not variables) for which var equals value. The if command works as described below.
More info: "Stata Tutorial"

Simple tests of association
In "User" Menu: "Simple Tests of Association"

Command: tabulate var1 var2, chi2
Additional Options: row; column; chi2
In "User" Menu: "Cross-Tabulate (tabulate)"
Useful for what? Performs a cross tab between var1 and var2. Allows you to see whether two categorical variables are associated. Use the "chi2" option after the comma to test whether the two variables are associated.
More info: "Stata Tutorial", "Tips for using...", Stats Review Lecture
Ex: tabulate gender buyer, chi2

Command: ttest var1, by(var2)
In "User" Menu: "Test of Means (ttest)"
Useful for what? Performs a test to check whether the means of var1 for two groups are the same. The groups are described by var2, which should have exactly two values.
More info: "Tips for using...", Stats Review Lecture
Ex: ttest total_, by(female)

Command: pwcorr var1 var2 ..., sig
In "User" Menu: "Correlation Between Variables (pwcorr)"
Useful for what? Calculates the correlation between any set of variables, two at a time.
More info: "Tips for using...", Stats Review Lecture
Ex: pwcorr total_ purch last, sig

Regular regression (OLS)
In "User" Menu: "Regular Regression (OLS)"

Command: regress depvar var1 var2 ...
In "User" Menu: "Run Regression (regress)"
Useful for what? Performs a "regular", i.e. OLS, regression. Works best when you have a continuous dependent variable.
More info: "Tips for using...", Stats Review Lecture
Ex: regress salary female mba experience

Command: predict newvar
In "User" Menu: "Generate Predicted Values (predict)"
Useful for what? Creates predicted values for the dependent variable based on the coefficients of the regression you performed. The predicted values are stored in newvar. Always use it right after the regression command, because predict uses the coefficient estimates of the last regression that you ran.
More info: Logistics Lecture
Ex: predict salary_predict

Command: test var1==var2
In "User" Menu: "Test Significance of Coefficients (test)"
Useful for what? Performs a test of whether the coefficients of two variables are the same. You can also use "test var==X" where X is any number. This tells you whether the coefficient of var is statistically different from that number.
More info: Interaction Effects Lecture
Ex: test female==4500
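
The regress, predict, and test commands are meant to be run in sequence, because predict and test both rely on the most recent regression. The sketch below shows that order, assuming Stata's bundled auto dataset in place of the salary data; the variable name price_predict and the value 3000 are arbitrary illustrations.

```stata
* Sketch only: the bundled auto dataset stands in for the course data
sysuse auto, clear

* OLS regression with a continuous dependent variable
regress price mpg weight foreign

* Predicted values from the regression that was just run
predict price_predict

* Are the coefficients of mpg and weight the same?
test mpg = weight

* Is the coefficient of foreign statistically different from 3000?
test foreign = 3000
```

If another regression is run between regress and predict, the predicted values come from that later model instead.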

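Logistic regression
In "User" Menu: "Logistic Regression"

Command: logistic depvar var1 var2 ...
Additional Options: coef
In "User" Menu: "Run Logistic Regression (logistic)"
Useful for what? Performs a logistic regression. Works only when you have a binary (0/1) dependent variable.
More info: Logistics Lecture

Command: predict
In "User" Menu: "Generate Predicted Values (predict)"
Useful for what? Same as for regular regression, see above.
More info: Logistics Lecture

Command: test
In "User" Menu: "Test Significance of Coefficients (test)"
Useful for what? Performs a test of whether the odds ratios of two variables are the same.
More info: Interaction Effects Lecture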

Command: if var==value
In "User" Menu: see separate tab in all dialog boxes
Useful for what? This can be used after nearly every command (just before the comma, if there are any options) to have that command be executed only for the observations for which var equals value. If var is a continuous variable one can also use ">" or "<" in place of "==".
Ex: if age>20 & age<40
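
The placement of if relative to the comma is easiest to see in a full command. This is a minimal sketch, again assuming Stata's bundled auto dataset rather than the course data; the cutoff values are arbitrary.

```stata
* Sketch only: the bundled auto dataset stands in for the course data
sysuse auto, clear

* if comes after the variable list and before the comma that starts the options
summarize price if foreign == 1, detail

* Two conditions on a continuous variable, combined with &
count if mpg > 20 & mpg < 30
```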
