Documentation CRSP Compustat v1 PDF

Title Documentation CRSP Compustat v1
Author Guan Eric
Course Special Topics In Business - Adv Tpc Invstmnt
Institution The University of British Columbia



Description

A few words about CRSP and Compustat in general and how to use them together:

In this course you may need data to test your trading strategy, and UBC has access to very useful databases through WRDS. You can access WRDS (https://wrdsweb.wharton.upenn.edu/wrds/index.cfm?) using the name and password provided on the course slides. WRDS provides access to many databases; CRSP and Compustat are the two likely to be of most interest to you right now. CRSP contains data on stock prices and trading for publicly listed (i.e. traded) firms. Compustat contains accounting information for public firms (basically the information you would find in financial statements). CRSP and Compustat cover two universes of companies, and these universes are not quite identical. Loosely speaking, Compustat should contain (almost) all firms in CRSP, but CRSP does not contain all firms in Compustat.

One of the most important tools in each database is the identifier of the entities it contains, and unfortunately CRSP and Compustat use different identifiers. CRSP is security oriented: it provides information on the prices of securities (stocks…). You may be familiar with the ticker as a very common identifier of a security. For instance, the ticker for Microsoft’s stock is MSFT and for Apple’s stock it is AAPL. In CRSP, however, the most commonly used identifier is PERMNO, a unique number assigned to each security in the database. Each security has its own PERMNO, and even if the security no longer exists (e.g. because the company went bankrupt), its PERMNO stays assigned to it and is never reused for anything else (in contrast, a ticker may be reused for a different stock). Other identifiers in CRSP are the ticker (I just mentioned its disadvantage), CUSIP, and PERMCO. PERMCO is a company identifier, so if one company has two different stocks (e.g. common and preferred), both stocks have the same PERMCO but different PERMNOs.
However, you may find it most comfortable to work with PERMNO. Compustat has different identifiers. The most widely used one is GVKEY, which is company specific: each company in Compustat has its own GVKEY, whether it still exists or has already gone bankrupt.

Sometimes we want to use information from both databases. For instance, we may want to use accounting information such as book value of equity, cash flow, or net income to screen (sort) companies, and then look at how these companies perform on the stock market. To do that, we need to combine information from Compustat with information from CRSP. In other words, we need to assign to each security in CRSP the corresponding firm in Compustat (or, going the other way, assign to each firm in Compustat the corresponding securities). So we need to link PERMNOs to GVKEYs. How can we do that? WRDS now offers the so-called CCM database (CRSP/Compustat Merged database). In WRDS, go to Get Data (All Data), CRSP, CRSP/Compustat Merged, Fundamentals Annual. You will arrive at a data selection page very similar to the one for Compustat.
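To make the relationship between the identifiers concrete, here is a minimal sketch in pandas. All numbers are invented for illustration, and the table layouts and the share_class column are assumptions, not the actual WRDS file formats. It only shows the idea that one company (one PERMCO, one GVKEY) can correspond to several securities (several PERMNOs).

```python
import pandas as pd

# Toy CRSP-style table: every value here is made up for illustration.
crsp_securities = pd.DataFrame({
    "PERMNO": [10001, 10002, 10003],                     # security-level identifier
    "PERMCO": [500, 500, 501],                           # company-level identifier
    "share_class": ["common", "preferred", "common"],    # hypothetical column
})

# Toy link table: maps each company (GVKEY) to its securities (LPERMNO).
link_table = pd.DataFrame({
    "GVKEY": ["001000", "001000", "001001"],
    "LPERMNO": [10001, 10002, 10003],
})

# How many securities does each company have, according to CRSP?
securities_per_permco = crsp_securities.groupby("PERMCO")["PERMNO"].nunique()

# Attach the Compustat company identifier to each CRSP security.
linked = crsp_securities.merge(
    link_table, left_on="PERMNO", right_on="LPERMNO", how="inner"
)
securities_per_gvkey = linked.groupby("GVKEY")["PERMNO"].nunique()

print(securities_per_permco)
print(securities_per_gvkey)
```

In this toy data the first company has two securities, so any accounting number recorded under its GVKEY would be attached to both PERMNOs after the merge.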

In Step 2 (Apply your company codes), you can select which identifier you want to use. The big difference from Compustat is that you can select LPERMNO, which is our PERMNO identifier from CRSP. In Step 3 (Linking Options), you can select which types of links between the two databases to include (LC and LU together are the most common choice). This may sound a bit abstract, but there is a methodology behind how the two databases are linked, and it is worth at least skimming it. Some information is provided in WRDS (https://wrdsweb.wharton.upenn.edu/wrds/support/Data/_001Manuals%20and%20Overviews/_002CRSP/ccmoverview.cfm). Very important: there are three options below the window for ‘Fiscal Periods and Link Date Requirements’. They specify which part of the fiscal period must fall within the link date range.

In Step 4 (if you skip the first window), you choose the variables you want to download (very similar to the selection for Compustat). The selection has several tabs (Search All, Link Information, Identifying Information,…, Balance Sheet Items, Income Statement, and so on), so you can get all the data you would get from Compustat. For our purposes right now, though, click on the ‘Link Information’ tab and you will see various items describing how data in the two databases are linked, e.g. First Effective Date of Link and Last Effective Date of Link. These are useful because they give the time period during which a particular security in CRSP is linked to a company in Compustat. For instance, one PERMNO (i.e. one particular security in CRSP) might be linked with a particular company in Compustat starting from Jan 1, 1980, and ending Dec 31, 2005. Finally, you can download the data (e.g. as csv) in the same way as you would for Compustat. The advantage is that you now have the Compustat accounting information together with the PERMNOs, which you can use to link with CRSP.
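If you download the link information and want to apply the link type and link date restrictions yourself, the logic could look roughly like the sketch below. The column names LINKTYPE, LINKDT, LINKENDDT and datadate are assumptions about how your download is labelled, the data are invented, and I assume that a link which is still active has a missing end date. The check used here (fiscal period end date inside the link range) is only one possible requirement.

```python
import pandas as pd

# Toy CCM extract: all values are made up; column names are assumptions.
ccm = pd.DataFrame({
    "GVKEY": ["001000", "001000", "001001", "001002"],
    "LPERMNO": [10001, 10001, 10003, 10004],
    "LINKTYPE": ["LC", "LC", "LU", "LX"],
    "LINKDT": pd.to_datetime(
        ["1980-01-01", "1980-01-01", "1990-06-30", "1985-01-01"]
    ),
    # None becomes NaT, which stands for "link has not ended" in this sketch.
    "LINKENDDT": pd.to_datetime(
        ["2005-12-31", "2005-12-31", None, "1999-12-31"]
    ),
    "datadate": pd.to_datetime(
        ["2004-12-31", "2006-12-31", "2010-12-31", "1995-12-31"]
    ),
})


def filter_valid_links(df, link_types=("LC", "LU")):
    """Keep rows with an accepted link type whose fiscal period end
    date (datadate) falls inside the link date range."""
    type_ok = df["LINKTYPE"].isin(link_types)
    after_start = df["datadate"] >= df["LINKDT"]
    # A missing end date is treated as an open-ended link.
    before_end = df["LINKENDDT"].isna() | (df["datadate"] <= df["LINKENDDT"])
    return df[type_ok & after_start & before_end].reset_index(drop=True)


valid = filter_valid_links(ccm)
print(valid[["GVKEY", "LPERMNO", "LINKTYPE", "datadate"]])
```

In the toy data, the second row is dropped because its fiscal period ends after the link ended, and the last row is dropped because its link type is not LC or LU.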
Provided that you have one dataframe with CRSP data and another dataframe with CCM data (including PERMNOs), you can use the merge command in Python (pandas) to combine the two. The general command is df1.merge(df2, left_on=['PERMNO', 'year'], right_on=['LPERMNO', 'year'], how='inner'), where 'year' stands for whatever year column you have and can be left out if you merge on PERMNO alone. The reason I suggest merging not only by PERMNO but also by year is that you want to connect the accounting information from the relevant year with the stock price (return) information. Think carefully about this, since it is not trivial. You should ask what information is actually available to investors in real time. Usually that is only the accounting information from the previous year (or quarter), so it might be advisable to link CRSP returns with CCM data from the previous year. (Good guidance on this is in the Fama and French (1992, 1993) papers. They explicitly describe how much time lag they allow for the accounting data they use, to make sure that the data is publicly available.) Also, please check the documentation of the .merge command, especially to understand what the ‘how’ parameter means and what other values you can use there (outer, …).

This gives you an idea about CRSP and Compustat. UBC has other databases that you might consider using for your project (e.g., IBES for analyst forecasts), and you may find some other data that you would like to use. In any case, you will need a general strategy for linking databases together, as shown in the Investment Strategy in class....
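As a rough illustration of the merge described above, the sketch below links CRSP returns in a given year with CCM accounting data from the previous fiscal year. The data and the column names year, ret, fyear and book_equity are invented for the example, and the one-year lag is a simplifying assumption rather than the exact timing convention of any particular paper.

```python
import pandas as pd

# Toy CRSP returns: all values are made up for illustration.
crsp = pd.DataFrame({
    "PERMNO": [10001, 10001, 10001, 10003],
    "year": [2004, 2005, 2006, 2005],
    "ret": [0.10, -0.05, 0.20, 0.07],
})

# Toy CCM accounting data keyed by LPERMNO and fiscal year.
ccm = pd.DataFrame({
    "LPERMNO": [10001, 10001, 10003],
    "fyear": [2003, 2004, 2003],
    "book_equity": [100.0, 110.0, 50.0],
})

# Shift the accounting data forward by one year, so that fiscal year
# 2003 numbers are matched with returns in 2004 (assumed lag).
ccm_lagged = ccm.assign(year=ccm["fyear"] + 1)

# Inner merge: keep only rows that have a match in both dataframes.
merged = crsp.merge(
    ccm_lagged,
    left_on=["PERMNO", "year"],
    right_on=["LPERMNO", "year"],
    how="inner",
)

# Outer merge with an indicator column shows what the inner merge dropped.
outer = crsp.merge(
    ccm_lagged,
    left_on=["PERMNO", "year"],
    right_on=["LPERMNO", "year"],
    how="outer",
    indicator=True,
)

print(merged[["PERMNO", "year", "ret", "fyear", "book_equity"]])
print(outer["_merge"].value_counts())
```

Comparing the inner and outer results is a quick way to see how many return observations have no accounting data (and the other way round) before you decide which ‘how’ option fits your strategy.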

