How to link the CRSP stock market database and the SEC’s EDGAR database correctly?
Authors: Alexander Hillert (Goethe University) and Michael Ungeheuer (Aalto University)
Abstract: An increasing number of top accounting and finance publications use data from the Securities and Exchange Commission’s (SEC) Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system to analyze text-based variables (e.g., annual report tone and readability, or actual share repurchases). In this paper, we show that correctly linking the standard CRSP/Compustat stock and accounting data to the SEC’s EDGAR database is challenging but essential for obtaining research results that are free of selection bias. More specifically, we document that the most popular linking approach, which relies on Compustat’s linking table, results in a firm universe that is not only incomplete but also systematically different from the overall CRSP/Compustat stock universe. We develop a historical linking table that results in a comprehensive and representative stock universe.
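To illustrate what "historical" means for a linking table, the following is a minimal sketch and not the table or procedure developed in the paper. It assumes each link between an EDGAR filer identifier (CIK) and a CRSP security identifier (PERMNO) carries a validity interval, so that a filing is matched on both the identifier and the filing date. The `Link` class, the `permno_for_filing` function, and all rows are hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    """One row of a hypothetical date-ranged CIK-to-PERMNO linking table."""
    cik: str               # EDGAR filer identifier (kept as zero-padded text)
    permno: int            # CRSP security identifier
    start: date            # first day on which the link is valid
    end: Optional[date]    # last valid day; None means the link is still open


# Made-up example rows; these are not real firms or real identifiers.
LINKS: List[Link] = [
    Link("0000000001", 10001, date(1995, 1, 1), date(2003, 6, 30)),
    Link("0000000001", 10002, date(2003, 7, 1), None),
    Link("0000000002", 20001, date(1998, 3, 1), date(2010, 12, 31)),
]


def permno_for_filing(cik: str, filing_date: date,
                      links: List[Link] = LINKS) -> Optional[int]:
    """Return the PERMNO whose link interval covers filing_date, else None."""
    for link in links:
        if link.cik != cik:
            continue
        still_open = link.end is None or filing_date <= link.end
        if link.start <= filing_date and still_open:
            return link.permno
    return None  # no link valid on that date: the filing stays unmatched


if __name__ == "__main__":
    # The same CIK maps to different PERMNOs depending on the filing date.
    print(permno_for_filing("0000000001", date(2000, 3, 15)))
    print(permno_for_filing("0000000001", date(2005, 3, 15)))
    # A filing dated after the link has ended is not matched.
    print(permno_for_filing("0000000002", date(2012, 1, 1)))
```

In this sketch, a link table that recorded only the most recent CIK-to-PERMNO pair would assign the earlier filing to the wrong security or drop it, which is one way a linked sample can end up incomplete.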