Accessing Financial Data in EDGAR using Python

Some financial data sources can be costly or difficult to locate. Some are only available through exchanges or vendors, which often charge high fees for access. But the U.S. Securities and Exchange Commission (SEC) makes company financial data available for free.

The mission of the U.S. SEC is to protect investors and to ensure that markets are fair and orderly.

Companies must provide accurate information about their business to the SEC. Those that don’t face fines or criminal prosecution. The SEC requires this information to be submitted in specific forms, and the data entered in these forms are called “filings”. You can access the filing data for free via the SEC’s EDGAR system. This article describes how to access EDGAR filings using Python.

Poking around EDGAR

You can type the name of a company into EDGAR’s main search page in a web browser. It returns the filings related to that company.

Each filing has a human-readable document and a link to the actual filing data.

Accessing Data using EDGAR

Accessing data on EDGAR sites requires following a few rules:

  • Limit your requests to at most 10 per second.
  • Provide a company name and an administrative email address in your HTTP headers.
  • Use a user agent that supports compressed content (gzip or deflate).
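
The rules above can be sketched with Python's standard library alone. This is a minimal, single-threaded sketch, not a definitive client: the identity string is a hypothetical placeholder you would replace with your own, and the helper names (throttle, build_request, decode_body, fetch) are made up for illustration.

```python
import gzip
import time
import urllib.request
import zlib

# Hypothetical identity: replace with your own company name and contact email.
USER_AGENT = "Example Company admin@example.com"
MIN_INTERVAL = 0.1  # seconds between requests, i.e. at most 10 per second

HEADERS = {
    "User-Agent": USER_AGENT,
    "Accept-Encoding": "gzip, deflate",  # advertise compressed-content support
}

_last_request = 0.0


def throttle():
    """Sleep just long enough to keep MIN_INTERVAL between consecutive requests."""
    global _last_request
    wait = MIN_INTERVAL - (time.monotonic() - _last_request)
    if wait > 0:
        time.sleep(wait)
    _last_request = time.monotonic()


def build_request(url):
    """Attach the identifying headers to a request for the given URL."""
    return urllib.request.Request(url, headers=HEADERS)


def decode_body(body, encoding):
    """Decompress a response body according to its Content-Encoding value."""
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumes the zlib-wrapped form of deflate.
        return zlib.decompress(body)
    return body


def fetch(url):
    """Fetch a URL while respecting the rate limit; returns the raw bytes."""
    throttle()
    with urllib.request.urlopen(build_request(url)) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding", ""))
```

Routing every call through fetch keeps the rate limit and headers in one place. A third-party HTTP library would handle decompression for you, but the standard library version makes each of the three rules visible.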

The Central Index Key (CIK) 

The Central Index Key (CIK) is the unique identifier that the SEC assigns to each filer. Every filing is recorded under a CIK.
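
As a rough illustration of working with CIKs, the sketch below pads a CIK to the 10-digit form used in EDGAR's data URLs and looks one up by ticker. It assumes a ticker mapping shaped like the company_tickers.json file the SEC publishes (entries with cik_str, ticker and title fields); that layout is an assumption here, the function names are hypothetical, and the inline sample stands in for the downloaded file.

```python
def pad_cik(cik):
    """Return the CIK as a zero-padded, 10-digit string."""
    return str(int(cik)).zfill(10)


def find_cik(tickers, ticker):
    """Look up a ticker (case-insensitive) in the mapping; None if absent."""
    for entry in tickers.values():
        if entry["ticker"].upper() == ticker.upper():
            return pad_cik(entry["cik_str"])
    return None


# Inline sample in the assumed layout of the SEC's ticker file.
sample_tickers = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
}

print(find_cik(sample_tickers, "aapl"))  # prints 0000320193
```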

Recent Filings 

Once you find the CIK, you can request all the recent filings for that CIK. When you request the submissions for a CIK, you get a JSON document with a lot of useful information about the filer and its filings.
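
A minimal sketch of this step follows. It builds the submissions URL for a CIK and picks filings of one form type out of the returned document. It assumes the JSON keeps recent filings under filings → recent as parallel lists (form, filingDate, accessionNumber). The function names are hypothetical, and the sample accession numbers and dates are made-up placeholders rather than real filings.

```python
def submissions_url(cik):
    """Build the submissions URL for a CIK (zero-padded to 10 digits)."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


def recent_forms(submissions, form_type):
    """Return (filing date, accession number) pairs for one form type."""
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [(date, acc) for form, date, acc in rows if form == form_type]


# Placeholder document in the assumed layout; the values are not real filings.
sample_submissions = {
    "cik": "320193",
    "filings": {
        "recent": {
            "accessionNumber": ["0000000000-24-000001", "0000000000-24-000002"],
            "filingDate": ["2024-01-15", "2024-02-01"],
            "form": ["10-K", "8-K"],
        }
    },
}

print(submissions_url(320193))
print(recent_forms(sample_submissions, "10-K"))
```

In real use, the document would come from fetching submissions_url(cik) with the required headers and parsing the response with json.loads.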

The final step is to convert the data and extract what you need. The raw filings data can be messy, but with a bit of Python code you can convert it into a pandas DataFrame and pull out the fields you want.
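
One way the conversion might look, assuming the recent filings arrive as parallel lists of equal length under filings → recent. The filings_frame helper is a hypothetical name, and the sample values are made-up placeholders rather than real filings.

```python
import pandas as pd


def filings_frame(submissions):
    """Turn the parallel lists of recent filings into one row per filing."""
    recent = submissions["filings"]["recent"]
    df = pd.DataFrame(recent)
    # Parse the date strings so they can be sorted and filtered as dates.
    df["filingDate"] = pd.to_datetime(df["filingDate"])
    return df


# Placeholder document in the assumed layout; the values are not real filings.
sample_submissions = {
    "filings": {
        "recent": {
            "accessionNumber": ["0000000000-24-000001", "0000000000-24-000002"],
            "filingDate": ["2024-01-15", "2024-02-01"],
            "form": ["10-K", "8-K"],
        }
    }
}

df = filings_frame(sample_submissions)
annual_reports = df[df["form"] == "10-K"]  # extract only the annual reports
print(annual_reports)
```

Once the data is in a DataFrame, filtering by form type or date range is a single boolean mask instead of a hand-written loop.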

Read more at wrighters.io.