Title | Stata Cheat Sheet |
---|---|
Author | Celina Stewart |
Course | Managerial Decision Analysis |
Institution | Northwestern University |
Pages | 2 |
Stata Cheat Sheet
Information- and Technology-Based Marketing, Professor Florian Zettelmeyer
Command | Example | In "User" Menu | Useful for what? | Additional Options | More info |
---|---|---|---|---|---|
Keeping Stata up to date | | | | | |
update all | | N/A | This tells Stata to update itself to the newest version. | | Stata help under "update" |
update swap | | N/A | This needs to be issued if Stata downloaded a new "exe" file during the "update all" command. Stata will tell you when it is needed. Failure to do so will create problems with several commands. | | Stata help under "update" |
Set Stata Preferences | | "Set Stata Preferences" | | | |
window manage prefs default | window manage prefs default | "Restore Default Window Layout (window manage)" | This restores the default window arrangement of Stata. Use it if your windows are messed up or if you can't see the Command Window. | | "Stata Tutorial" |
first clear, then set memory #m, permanently | first clear, then set memory 20m, permanently | "Clear Memory (clear)", then "Set Memory for Data (set memory)" | This allocates the given number of MB of memory to Stata. To figure out how much to allocate, look at how big the data set is on the hard drive and double the number. | | "Stata Tutorial" |
first clear, then set matsize # | first clear, then set matsize 200 | "Clear Memory (clear)", then "Set Memory in Procedures (set matsize)" | This specifies that Stata can use up to the given number of variables in a regression. Usually 200 is plenty. Use the option "permanently" if you want the matsize setting to stick after you quit Stata. | permanently | "Stata Tutorial" |
Command | Example | In "User" Menu | Useful for what? | Additional Options | More info |
---|---|---|---|---|---|
Recording your work | | "Record your work" | | | |
log using filename | log using "C:\data\BBB_analysis.log", replace | "Open Results Log (log using)" | Opens a "Results log" file that records everything in the Results window in Stata. This file can later be opened in any text editor to cut and paste into MS Word or another word processor. With the option "replace", an existing log file of the same name is overwritten. With the option "append", what you do next is appended to an existing log file of the same name. | replace; append | "Stata Tutorial" |
log close | | "Close Results Log (log close)" | This closes the Results log file. | | |
cmdlog using filename | cmdlog using "C:\data\BBB_analysis.txt", replace | "Open Command Log (cmdlog using)" | Opens a "Commands log" file that records your commands separately from your results so that you can quickly replicate what you did. This file can later be opened in the do-file editor to create a do-file. With the option "replace", an existing command log file of the same name is overwritten. With the option "append", what you do next is appended to an existing command log file of the same name. | replace; append | "Stata Tutorial" |
cmdlog close | | "Close Commands Log (cmdlog close)" | This closes the Command log file. | | |
Open/Save data | | "Open/Save Data" | | | |
use filename | use "C:\data\BBB.dta", clear | "Open Stata Dataset (use)" or use Toolbar | Opens a Stata data file. Make sure to specify the full path. Often it is easier to read the data in using the Toolbar. | clear | "Stata Tutorial" |
save filename | save "C:\data\BBB.dta", replace | "Save Stata Dataset (save)" or use Toolbar | Saves the data in memory as a Stata data file. Make sure to specify the full path. Often it is easier to save the data using the Toolbar. | replace | "Stata Tutorial" |
insheet using filename | insheet using "C:\data\mydata.txt" | "Import From Spreadsheet (insheet)" | This allows you to import data from Excel very easily. Just make sure you have exported the data in tab-delimited format in Excel and that the first row of the spreadsheet contains the variable names you want. Often it is easier to read the data in using the dialog box from the "User" menu instead of typing the command. | | "Stata Tutorial"; Stata help under "insheet" |
outsheet using filename | outsheet using "C:\data\mydata.txt" | "Export To Spreadsheet (outsheet)" | This allows you to export data to Excel very easily. Often it is easier to export the data using the dialog box from the "User" menu instead of typing the command. | | Stata help under "outsheet" |
Viewing data as spreadsheet | | "View Data as Spreadsheet" | | | |
browse | | "Browse Data (browse)" or use Toolbar | Shows a spreadsheet with all the data in it. Great to get an overview of the data. | | "Stata Tutorial" |
browse varlist | browse buyer acctnum | N/A | This works just like browse but it only shows the listed variables. This works very well when you have many variables but are only interested in a few of them. | | Stata help under "browse" |
edit | | "Edit/Modify Data (edit)" or use Toolbar | Same as browse, but with the ability to edit/modify the data. | | "Stata Tutorial" |
edit varlist | edit buyer acctnum | N/A | Same as browse varlist, but with the ability to edit/modify the data. | | Stata help under "edit" |
Summarize and describe data | | "Summarize and Describe Data" | | | |
describe | | "Describe Data in Memory (describe)" | Gets a listing of all variables in the dataset and shows whether they are numerical or string variables. This also tells you how much free memory you have. | | "Stata Tutorial" |
label list | label list | "View Data Labels (label list)" | Some variables are displayed with string labels although they are actually coded as numbers. This allows you to see what numbers correspond to what labels. | | "Stata Tutorial" |
summarize varlist | summarize purch last | "Simple Summary Statistics (summarize)" | This is the standard command to get summary statistics (mean, standard deviation, min, max) for a variable. Remember that this only works well for continuous variables. | by() | "Stata Tutorial" |
tabstat varlist, statistics(statname ...) | tabstat total_ purch, statistics(count mean sd median) by(gender) | "Complex Summary Statistics (tabstat)" | This is a much more powerful version of the summarize command. You can ask for many types of summary statistics, including the "median", the "sum" of the values of the variables, a "count" of the number of observations, etc. | by() | "Stata Tutorial" |
tabulate varname | tabulate gender | "Tabulate (tabulate)" | Shows all the values of a categorical variable and how many observations have each value. | by() | "Stata Tutorial" |
tabulate var1 var2 | tabulate gender buyer, chi2 | "Cross-Tabulate (tabulate)" | Performs a cross tab between var1 and var2. Allows you to see whether two categorical variables are associated. Use the "chi2" option after the comma to test whether the two variables are associated. | row; column; chi2 | "Stata Tutorial" |
Graph data | | "Graph Data" | | | |
scatter yvar xvar | scatter geog art | "Draw Scatter Plot (scatter)" | This is a good command to get a graphic depiction of how two continuous variables yvar and xvar are associated. | by() | "Stata Tutorial"; Logistics Lecture |
graph bar (stat) yvar, over(groupvar) | graph bar (mean) buyer, over(region) | "Draw Bar Graph (graph bar)" | This creates a bar graph of yvar over groupvar, where the Y axis depicts the mean, the sum, or whatever statistic of yvar is specified in (stat). | by() | "Stata Tutorial" |
histogram varname, frequency | histogram total_, frequency | "Draw Histogram (histogram)" | This displays the distribution of varname. This is very useful to get a feel for what values the data have and how often they occur. If your data is discrete and you want one bar for each value, use the option discrete. | by(); discrete | "Stata Tutorial" |
Manipulate Variables and Observations | | "Create and Modify Variables" | | | |
generate newvar=... | generate ordersize=total_/purch | "Generate New Variable (generate)" | Generates a new variable with the name newvar from other variables and manipulations thereof. | | "Stata Tutorial" |
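As a rough illustration of how the logging, data-handling, summary, and graphing commands listed above fit together, the following do-file sketch runs them on auto.dta, the example dataset that ships with Stata (loaded with sysuse), rather than on the course data. The file names cheatsheet_demo.log, auto_copy.dta, and auto_copy.txt are hypothetical placeholders written to the current working directory, and the derived variable price_per_lb is made up for the example. The // comments work only inside a do-file, not in the Command window.

```stata
* Sketch of a logged session; auto.dta ships with Stata.
* All file names below are hypothetical placeholders.
log using "cheatsheet_demo.log", replace

sysuse auto, clear                       // load example data, discarding data in memory

* Save, export to tab-delimited text, re-import, then reload the .dta copy
save "auto_copy.dta", replace
outsheet using "auto_copy.txt", replace
insheet using "auto_copy.txt", clear
use "auto_copy.dta", clear

describe                                 // variable names, storage types, labels
label list                               // numeric codes behind the value labels

summarize price mpg                      // mean, sd, min, max
tabstat price mpg, statistics(count mean sd median) by(foreign)

tabulate rep78                           // frequencies of one categorical variable
tabulate foreign rep78, chi2 row         // cross tab with row percentages and chi-squared test

scatter price weight                     // association between two continuous variables
graph bar (mean) price, over(foreign)    // mean price for each category of foreign
histogram rep78, discrete frequency      // one bar per value, counts on the Y axis

generate price_per_lb = price/weight     // new variable built from existing ones

log close
```

Opening the log first and closing it last means every result in between ends up in the log file, which is the workflow the "Recording your work" commands are meant to support.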
Command | Example | In "User" Menu | Useful for what? | Additional Options | More info |
---|---|---|---|---|---|
egen newvar=mean(varname), by(groupvar) | egen res_prob=mean(buyer), by(zip3) | "Generate New Variables on Steroids (egen)" | This creates a new variable newvar that contains the mean of varname for each different value of groupvar. Instead of the mean one can also use "sum", "max", "min", "count", and many other statistics that are calculated from varname for each category of groupvar separately. | | RFM Lecture, Stata help under "egen" |
replace varname=... | Ex 1: replace female=0 if female==. Ex 2: replace monet_dec=11-monet_dec (This "flips" the values of the N-tile to make sure that the "best" customers are always in the first N-tile. For example, for deciles this becomes: "replace varname=11-varname") | "Replace/Change Existing Variable (replace)" | Replaces the value of varname with what is on the right hand side of the "=" sign. | | "Stata Tutorial" |
xtile newvar=varname, nquantiles(#) | xtile purch_dec=purch, nquantiles(10) | "Create deciles or quintiles (xtile)" | Creates a new variable newvar containing the value of the N-tile based on the values of varname. The number in nquantiles() specifies whether one gets quintiles (N=5) or deciles (N=10) or any other desired number of categories. | | RFM Lecture, "RFM_BBB_stata.do", Stata help under "xtile" |
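The egen, replace, and xtile steps above, including the decile "flip", might be sketched as follows. This is only an illustration on Stata's bundled auto.dta rather than the course data; the new variable names mean_price and price_dec are hypothetical, and setting missing rep78 values to 0 merely mirrors the shape of the "female" example rather than being a sensible recoding.

```stata
sysuse auto, clear

* Group-level statistic: mean price within each category of foreign
egen mean_price = mean(price), by(foreign)

* Overwrite missing values, in the style of "replace female=0 if female==."
replace rep78 = 0 if rep78 == .

* Deciles of price: 1 = lowest prices, 10 = highest prices
xtile price_dec = price, nquantiles(10)

* Flip the deciles so that decile 1 now holds the highest prices
replace price_dec = 11 - price_dec

* Check the result: price range and number of cars in each decile
tabstat price, statistics(min max count) by(price_dec)
```

For quintiles the same flip would subtract from 6 instead of 11, since the general form is (N+1) minus the N-tile value.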
Command | Example | In "User" Menu | Useful for what? | Additional Options | More info |
---|---|---|---|---|---|
Manage String Variables | | "Manage String Variables" | | | |
encode varname, generate(newvar) | generate ordersize=total_/purch | "Convert String Variable to Integer (encode)" | Generates a new integer variable with the name newvar which has a unique value for each different entry in the string variable varname. In addition, the string values are all shown in the value labels. | | "Stata Tutorial" |
label list | label list | "View Data Labels (label list)" | Some variables are displayed with string labels although they are actually coded as numbers. This allows you to see what numbers correspond to what labels. | | "Stata Tutorial" |
Drop/Keep Variables and Observations | | "Drop/Keep Variables and Observations" | | | |
drop varlist | drop zip zip3 | "Drop Variables (drop / keep)" | Deletes the listed variables. Careful, you can't get the variables back! | | |
drop if varname==value | | "Drop Observations (drop if / keep if)" | Deletes observations (not variables) for which varname equals value. The if command works as described below. | | "Stata Tutorial" |
Simple tests of association | | "Simple Tests of Association" | | | |
tabulate var1 var2, chi2 | tabulate gender buyer, chi2 | "Cross-Tabulate (tabulate)" | Performs a cross tab between var1 and var2. Allows you to see whether two categorical variables are associated. Use the "chi2" option after the comma to test whether the two variables are associated. | row; column; chi2 | "Stata Tutorial"; "Tips for using...", Stats Review Lecture |
ttest varname, by(groupvar) | ttest total_, by(female) | "Test of Means (ttest)" | Performs a test to check whether the means of varname for two groups are the same. The groups are described by groupvar, which should have exactly two values. | | "Tips for using...", Stats Review Lecture |
pwcorr varlist, sig | pwcorr total_ purch last, sig | "Correlation Between Variables (pwcorr)" | Calculates the correlation between any set of variables, two at a time. | | "Tips for using...", Stats Review Lecture |
Regular regression (OLS) | | "Regular Regression (OLS)" | | | |
regress depvar indepvars | regress salary female mba experience | "Run Regression (regress)" | Performs a "regular", i.e. OLS, regression. Works best when you have a continuous dependent variable. | | "Tips for using...", Stats Review Lecture |
predict newvar | predict salary_predict | "Generate Predicted Values (predict)" | Creates predicted values for the dependent variable based on the coefficients of the regression you performed. The predicted values are stored in newvar. Always use right after the regression command. The predict command uses the coefficient estimates of the last regression that you ran. | | Logistics Lecture |
test var1==var2 | test female==4500 | "Test Significance of Coefficients (test)" | Performs a test of whether the coefficients of two variables are the same. You can also use "test varname==X" where X is any number. This tells you whether the coefficient of varname is statistically different from that number. | | Interaction Effects Lecture |
Logistic regression | | "Logistic Regression" | | | |
logistic depvar indepvars | | "Run Logistic Regression (logistic)" | Performs a logistic regression. Works only when you have a binary (0/1) dependent variable. | coef | Logistics Lecture |
predict newvar | | "Generate Predicted Values (predict)" | Same as for regular regression, see above. | | Logistics Lecture |
test var1==var2 | | "Test Significance of Coefficients (test)" | Performs a test of whether the odds ratios of two variables are the same. | | Interaction Effects Lecture |
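The following sketch strings together the string-variable, association-test, and regression commands listed above. It is an illustration under assumptions: it uses Stata's bundled auto.dta instead of the course data, and the names make_id, price_hat, and p_foreign, as well as the test value 3000, are hypothetical choices made only for this example.

```stata
sysuse auto, clear

* encode: turn the string variable make into a labeled integer variable
encode make, generate(make_id)
label list make_id                  // shows which number stands for which string
drop make_id                        // deletes the variable again

* Simple tests of association
tabulate foreign rep78, chi2        // two categorical variables
ttest price, by(foreign)            // grouping variable has exactly two values
pwcorr price mpg weight, sig        // pairwise correlations with significance levels

* Regular regression (OLS), prediction, and coefficient tests
regress price mpg weight foreign
predict price_hat                   // uses the regression that was just run
test mpg == weight                  // are the two coefficients equal?
test foreign == 3000                // does the coefficient differ from 3000?

* Logistic regression with a binary (0/1) dependent variable
logistic foreign mpg weight         // reports odds ratios
predict p_foreign                   // predicted probability that foreign equals 1
test mpg == weight
logistic foreign mpg weight, coef   // same model, coefficients instead of odds ratios
```

Running predict immediately after each estimation command matters because it always works from the most recent set of estimates.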
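Command | Example | In "User" Menu | Useful for what? | Additional Options | More info |
---|---|---|---|---|---|
if varname==value | if age>20 & age<40 | see separate tab in all dialog boxes | This can be used after nearly every command (just before the comma, if there are any options) to have that command be executed only for the observations for which varname equals value. If varname is a continuous variable one can also use ">" or "<", and two conditions can be combined with "&". | | |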
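A few hedged examples of where the if condition goes, again on Stata's bundled auto.dta rather than the course data; the cutoffs are arbitrary values chosen for illustration.

```stata
sysuse auto, clear

* The condition follows the variable list and precedes the comma
summarize price if foreign == 1
summarize price if mpg > 20 & mpg < 30
tabulate foreign rep78 if price > 5000, chi2

* Stata treats numeric missing values as larger than any number,
* so exclude them explicitly when using ">"
summarize price if rep78 > 3 & rep78 != .

* The same condition syntax works with drop
drop if rep78 == .
```

Equality is tested with a double equals sign inside the condition, while a single equals sign is reserved for assignment in generate and replace.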