Intro SAS - Summary Biostatistics I PDF

Title Intro SAS - Summary Biostatistics I
Course Biostatistics I
Institution Brock University


Introduction to SAS

WHAT IS SAS?
SAS originally stood for Statistical Analysis System. Its benefits and features are too numerous to mention, but statisticians use it for anything from traditional analysis of variance and predictive modeling to exact methods and statistical visualization techniques.

LAUNCHING SAS
Launching SAS is system-dependent. It usually involves double-clicking on the SAS icon. In Windows 7 and higher, type “SAS” in the Start Menu “Search programs and files” box.

THE SAS ENVIRONMENT
There are 3 windows in SAS that are very important:
1. Editor: This is the window in which you write your code.
2. Log: After running a program, this window shows you any error or warning messages generated by the program. It also shows you details about the datasets that you created with your program.
3. Output: Often, your SAS code will output certain statistics or other data. This is displayed in the Output window.

PERMANENT VS. TEMPORARY DATASETS
Libraries in SAS are folders where datasets are stored.

Temporary Datasets: When you start SAS, the “WORK” library is automatically created. This library stores any temporary datasets that you create during your SAS session; however, when SAS is closed, all datasets in the WORK library are deleted. This is fine for small datasets that can be created quickly, but sometimes you will want to save a dataset (particularly if it is very large) in order to save time the next time you open SAS and want to use the data. All temporary dataset names have only one part (e.g. ‘temp’ or ‘cohort’).

Permanent Datasets: In order to save a SAS dataset, you first have to create a library that points to a directory that exists on your computer (e.g. a directory in Windows, UNIX or Mac).
For example, if you want to save datasets in the directory “c:\biostats\SAS datasets” in Windows, do the following:
• First make sure that this directory exists. If the directory doesn’t exist, SAS will NOT create it for you.
• Next, tell SAS to make a link between the Windows directory and a shorthand library name called the ‘libname’. For example, the libname could be ‘homework’ or ‘project’ or ‘HIV’, etc. This is done by typing the following into the Editor window:


    libname homework 'c:\biostats\SAS datasets';

• Use your cursor to highlight the text you’ve typed in (position the cursor at the start of the text, hold down the left mouse button, and highlight the text).
• Select the little icon of a running person at the top of the screen (submit) or press F3.
• Check the log file.
• Now look in the Explorer window; the ‘homework’ library should have been added to the library list.

All permanent dataset names have the following form: libname.datasetname. For example, if you are using the library ‘homework’ and create a dataset called ‘cohort’ in this library, you refer to this dataset as “homework.cohort” in your SAS program.

Note: Any libnames that you create (e.g. homework) are only temporary mappings between SAS and the Windows directory. This mapping disappears when you exit from SAS. Therefore, in order to access the permanent dataset after exiting SAS, you must resubmit the libname statement from above to relink the directory. Any permanent datasets in the Windows directory will then be accessible from SAS.

READING DATA INTO SAS
You can access data in SAS in various ways. In each of these cases, you can read in data as either a temporary OR a permanent dataset, depending on whether you want to keep the dataset after closing SAS.

1. Entering data by hand
Type the following into your SAS editor window:

    /* sample program */
    data temp;
        input id age sex $ treatment $ result;
        square_root_age = sqrt(age);
        datalines;
    1 65 f a 10
    2 78 f a 8
    3 65 m a 9
    4 45 m a 9.3
    5 53 m b 11
    6 61 m b 7
    7 58 m b 7.5
    8 72 f b 7.8
    9 67 f b 8.7
    10 73 f b 9.5
    ;


Things to note:
• Comments are enclosed between “/*” and “*/”. Although the purpose of a SAS program may seem very obvious to you when you write it, it is a good idea (trust me on this) to use copious comments. At the top of the program, indicate the purpose of the program. At various points in the program, include comments explaining the purpose of the code.
• Indentation will make your SAS program easier to read.
• All SAS statements terminate with a semicolon (;). Leaving out the semicolon will probably be your most frequent mistake.
• The creation of a dataset starts with the keyword “data”, followed by the name of the dataset. Names can be up to 32 characters long, can consist of letters, numbers, and underscores, and must start with a letter or an underscore. SAS does not distinguish between lower and upper case in names. If you only specify a one-part name (such as above), the dataset will be temporary. If you have created a library, you can make this dataset permanent by adding the library name before the dataset name (e.g. using our homework library above, we could create homework.temp).
• The “input” line tells SAS how many variables there are, their names, and the order in which they appear. SAS assumes that all variables are numeric unless you tell it otherwise. A ‘$’ after a variable name indicates a character variable, that is, a variable whose values are non-numeric.
• The ‘datalines;’ statement indicates that the lines which follow contain data.
• SAS detects the end of one value and the start of the next by the presence of one or more blank spaces. So you can’t have data values which contain spaces (such as name = ‘Mary Smith’) unless you take precautions, which we won’t go into here.

• Run your program (highlight the text and click on the button with a running man on it).
• Check your log file.
• If you look in the “work” library, you can see the dataset “temp”. If you double-click on this, you can see what your dataset looks like.
• If you want to make this a permanent dataset, you could change the name to a two-part name (e.g. homework.new_dataset) in the code entered above, or you could write the following code in the program editor:

    data homework.new_dataset;
        set temp;
    run;



This code does the following:
  o SAS goes into the ‘work’ library, finds the dataset called ‘temp’ and reads the contents of that dataset into another dataset called ‘new_dataset’.
  o This dataset is saved as new_dataset.sas7bdat in the Windows directory pointed to by the libname ‘homework’.
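Because libnames last only for the current session, the permanent dataset has to be relinked before it can be used again. A minimal sketch of what that could look like in a later session, assuming the directory “c:\biostats\SAS datasets” and the dataset homework.new_dataset from the steps above:

```sas
/* re-create the link to the directory (needed once per SAS session) */
libname homework 'c:\biostats\SAS datasets';

/* the permanent dataset is available again under its two-part name */
proc print data = homework.new_dataset;
run;
```

If the libname statement is skipped, SAS should report in the log that the library ‘homework’ is not assigned.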


 

If you now look at the contents of c:\biostats\SAS datasets, you will see a file called new_dataset.sas7bdat. The icon will show you that it is a SAS dataset. If you click on this icon, SAS will start up and display the contents of the dataset to you.

When reading non-SAS data, be aware that the default length of a character variable in SAS is 8 characters. If you create a variable to read in character data, by default it will read in only the first 8 characters of a data value. To see what I mean, try the following SAS code:

    data newdata;
        input char $ num;
        datalines;
    longcharacterstring 10
    short 11
    ;



You can change the length of a character variable (that is, change the number of characters which can be stored in the variable). In the following example, note the ‘$’ in front of the 20. When you are telling SAS what length to make a variable, you must also tell SAS if it is a character variable (the default is to make it a numeric variable).

    data newdata;
        length char $20;
        input char $ num;
        datalines;
    longcharacterstring 10
    short 11
    ;
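To check which length a character variable actually received, the ‘contents’ procedure lists the type and length of every variable in a dataset. A short sketch, assuming the dataset ‘newdata’ from the example above exists in the WORK library:

```sas
/* list variable names, types (Char or Num) and lengths for newdata */
proc contents data = newdata;
run;

/* print the data to see whether the long value was truncated */
proc print data = newdata;
run;
```

Running this after each of the two versions of the program makes the effect of the ‘length’ statement visible in the Output window.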

2. Reading in pre-existing SAS datasets with .sas7bdat extensions
A file which has the extension .sas7bdat can be read directly into SAS. Specify the location of the file (the directory in which the file is located) using a “libname” statement within your SAS program, and then read the file in. For example, suppose you have been given a file called studydata.sas7bdat and you have stored the file in the directory c:\thesis\data.

    libname thesis 'c:\thesis\data';
    data mydata;
        set thesis.studydata;
    run;

3. Reading in non-SAS datasets
a) Text files
In Windows, a text file is generally indicated by the file extension .txt or .asc, or perhaps .dat. If the file is short enough, the simplest thing may be to copy and paste the data into your SAS program.


The simplest sort of text file to read in is one with no header, with all of the data values for a single observation on one line, and where the variables are separated from one another by one (or more) spaces or tabs, with no missing values. In this simple case, the file is specified using an ‘infile’ statement, followed by an ‘input’ statement. The ‘infile’ statement must come first, because SAS has to locate the file before it can execute the ‘input’ statement.

For the following example, first download uissurv.dat (Data File link for UMASS Aids Research Unit study) from: http://www.umass.edu/statdata/statdata/stat-survival.html

    data from_text;
        infile 'c:\biostats\sas datasets\uissurv.dat';
        input ID age becktota hercoc ivhx ndrugtx race treat site los time censor;
    run;

If a numeric value is missing (e.g. age), the text file must contain a single period (.) in place of the missing value. In the following example, age is missing for observation 3:

    001 65 F 24.9
    002 38 M 29.7
    003 .  F 31.2
    004 78 F 32.5

If a character value is missing (e.g. sex), if the values are separated by something other than a space (the usual non-space delimiter is a comma), if a single observation is split up over more than one line, and/or if there is more than one observation on a single line, ask for help.
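For orientation only, here is a hedged sketch of how one of these cases, a comma-delimited file, is commonly handled with options on the ‘infile’ statement. The file name example.csv and the variable names (id, age, sex, bmi) are hypothetical and do not come from the course data; a line of such a file might look like 003,,F,31.2.

```sas
data from_csv;
    /* hypothetical file: one observation per line, values separated by commas */
    /* dsd: comma is the delimiter and two commas in a row mean a missing value */
    /* missover: if a line ends early, set the remaining variables to missing   */
    infile 'c:\biostats\SAS datasets\example.csv' dsd missover;
    input id age sex $ bmi;
run;
```

This is an assumption about a typical file layout rather than a general recipe, so checking the log and printing the dataset afterwards is still advisable.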

b) Excel Files
The data on the Pagano CD (at least for the second edition of the book, copyright 2000) was created using Excel 2.1. I had to open the data in Excel and re-save it (use the “save as” command) in Excel 2000 format before I could read it into SAS. Data from up-to-date Excel files can be read into SAS using the SAS Import Wizard.
• From the File menu, select “import data”.
• Specify the type of file you want to import (e.g. Excel 2000) and then give the location of the Excel file.
• You are asked to indicate a ‘library’ and ‘member’. The library can be ‘work’ if you want to create a temporary dataset, or you can choose any other library that you have specified in SAS (e.g. ‘homework’) from the dropdown menu. Under “member”, enter a name for the dataset. For example, if I imported a file called “cigarettes_per_year.xls”, and I specified a ‘member’ name of “cigs” and the library “homework”, I would end up with a dataset called ‘cigs’ in my ‘homework’ library. I would refer to this dataset as homework.cigs in my SAS program.
• You will then be asked if you want to save a copy of the program which SAS is using to read in your data. In theory, there is no reason why you would want to see this program, but you can save the code if you are creating a temporary dataset and don’t want to go through the Import Wizard again.
• Once you have finished with the Import Wizard, be sure to check the log file to make sure the data was imported successfully, and to check the dataset visually. If the file was open in another program (e.g. if you were checking over the data in Excel before you imported it), the import will fail.
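The same import can also be written as code with the ‘import’ procedure. The following is a sketch, not the exact program the wizard saves: the directory holding cigarettes_per_year.xls is an assumption, and the ‘dbms = excel’ identifier depends on the SAS version and on the SAS/ACCESS Interface to PC Files being installed.

```sas
/* read an Excel file into the permanent dataset homework.cigs */
/* assumes the libname 'homework' has already been assigned    */
proc import datafile = 'c:\biostats\SAS datasets\cigarettes_per_year.xls'
            out = homework.cigs
            dbms = excel
            replace;
    getnames = yes;   /* take variable names from the first row of the sheet */
run;
```

As with the wizard, check the log afterwards and make sure the Excel file is closed before submitting the code.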

c) Other file types
Both the Import Wizard and the ‘import’ procedure support the following types of files:
• Dbase (.dbf)
• Lotus (.wk1, .wk3, .wk4)
• Excel (.xls)
• Comma and tab-delimited files

d) SAS Datasets with other extensions (e.g. .sd2, .sd7)
These files were created using older versions of SAS. If your directory contains only older versions of SAS datasets, then the libname statement, given above, should work. However, if your directory contains a mix of old and new SAS datasets (for example, if you have inherited datasets from people who worked on a project in previous years), you will need more than one libname statement for the same directory: one for each type of SAS datafile. Suppose your thesis directory contains the datafiles ‘former_student.sd2’ and ‘newer_data.sas7bdat’:

    libname old v6 'c:\thesis\data';
    libname new v8 'c:\thesis\data';
    data OldData;
        set old.former_student;
    run;
    data NewData;
        set new.newer_data;
    run;

That is, you require two libname statements, both of which refer to the same Windows directory. The designations ‘v6’ and ‘v8’ are what SAS calls ‘engines’, and indicate the file extensions which are contained in a particular library. There is also a ‘v7’ engine if you have files with the .sd7 extension.


In the above example, when you make new permanent datasets, be sure to save them using the libname ‘new’.

For more information on reading datasets into SAS: http://www.ats.ucla.edu/stat/sas/library/SASRead_os.htm#ReadingDataInline

EXPORTING YOUR SAS DATA
You can export your SAS data into an Excel spreadsheet or other types of files, in a way which is analogous to importing data. SAS has an Export Wizard. Alternatively, you can use the ‘export’ procedure, which looks a lot like the ‘import’ procedure. The ‘replace’ option will cause SAS to overwrite an existing file; if that’s not what you want, don’t include the ‘replace’ option. If SAS can tell from the filename extension in the ‘outfile’ option what the format of the output file is, you don’t need to include the ‘dbms’ option. In the example which follows, the ‘dbms’ option is not required, because the .xls extension on the filename indicates the type of file to be created.

    proc export data = my_sas_data
        outfile = 'c:\myfiles\fromsas.xls'
        dbms = excel2000
        replace;
    run;
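The same procedure can write other file types. As a sketch, the following would export the temporary dataset ‘temp’ created earlier to a comma-separated text file; the output path and file name are assumptions chosen for illustration.

```sas
/* write the temporary dataset 'temp' to a comma-separated text file */
proc export data = temp
            outfile = 'c:\biostats\SAS datasets\temp.csv'
            dbms = csv
            replace;   /* overwrite the file if it already exists */
run;
```

A text file like this can be opened in a plain text editor, which makes it easy to check what was actually written.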

MANIPULATING DATASETS
Once you have created a dataset, you can issue commands which manipulate it. Type in the following command, and submit it.

    proc print data = temp;
    run;

Look in your log and output windows; you should see a printout of the entire dataset ‘temp’ that you created earlier. Edit the program to introduce some mistakes. For example:
• Remove one of the parentheses in “sqrt(age)” and resubmit the program. Check the log file.
• Remove one of the semicolons and see what happens.

To print only certain variables, add the ‘var’ statement:

    proc print data = temp;
        var age sex;
    run;

To sort a dataset by a variable in the dataset, use the ‘sort’ procedure:

    proc sort data=temp;
        by age;
    run;
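Note that ‘proc sort’ without an ‘out=’ option replaces the original dataset with the sorted one. The sketch below, which uses the dataset ‘temp’ from earlier and a hypothetical output name ‘temp_sorted’, keeps the original order intact and sorts by more than one variable:

```sas
/* sort by sex, and within each sex from oldest to youngest */
/* out= writes the sorted copy to a new dataset             */
proc sort data = temp out = temp_sorted;
    by sex descending age;
run;

/* print the sorted copy to check the result */
proc print data = temp_sorted;
run;
```

The keyword ‘descending’ applies only to the variable that immediately follows it, so ‘sex’ is still sorted in ascending order here.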


You can make changes to an existing dataset using the ‘set’ statement:

    /* make a change to an existing dataset: use of the 'set' statement */
    data new;
        set temp;
        if age [...]

[...] Export As Image…
• Enter a name and path where you want the figure to be saved

To erase all text from a window (e.g. the Log or Output window):
• Click in the window that you want to clear
• Press Ctrl+E

HELPFUL WEBSITES
Helpful online SAS tutorials: http://www.ats.ucla.edu/stat/sas/
SAS v9.1 Online Help: http://support.sas.com/onlinedoc/913/docMainpage.jsp
