Scraping the Federal Reserve With VisiData

I’ve been loving VisiData for years now. It’s my favorite way to poke around tabular data without the overhead of spreadsheets.

In 2020 I posted a short video demonstrating some quick data exploration you can do with it:

Today I appreciated another snappy use case: pulling and transforming HTML table data.

I learned that the Federal Reserve publishes a regularly-updated list of the largest commercial banks. It looks like this:

Federal Reserve Large Commercial Banks table

It also comes in PDF and ASCII formats. Unfortunately, the ASCII format would require bespoke parsing.

The good news is the HTML-formatted data uses good old <table> instead of <div> nonsense. Better yet, VisiData can work really well with HTML tables:

curl -s https://www.federalreserve.gov/releases/lbr/current/ | vd +:table_1::

which does the following:

Pull the page HTML with curl
Pipe it into VisiData
Select the second (zero-indexed) table

which presents this:

VisiData showing large bank data from the Federal Reserve

Note

You can often skip the curl step. VisiData can usually fetch HTML pages directly with the url passed as a positional argument (go try it on Wikipedia!). But the Federal Reserve page was 403’ing me. Rather than mess with VisiData User-Agent strings, I moved the problem to curl, which worked.

You can do fun visualizations in the terminal, like graph the log10 of the Consol Assets (Mil $) column, which looks like this:

VisiData plot of the common logarithm of large bank assets from the Federal Reserve

But what’s even more handy is you can just convert the entire data set directly to TSV (there are commas in the numeric data; CSVs are kinda bad) in one go:

curl -s https://www.federalreserve.gov/releases/lbr/current/ |
  vd +:table_1:: -b -o feddata.tsv

or SQLite:

curl -s https://www.federalreserve.gov/releases/lbr/current/ |
  vd +:table_1:: -b -o feddata.sqlite

or JSON:

curl -s https://www.federalreserve.gov/releases/lbr/current/ |
  vd +:table_1:: -b -o feddata.json

And just like that, you’ve got clean data of every big bank in the country.

Not bad for one pipe.