JSON: Fun for Everyone!
Published on December 14, 2017
Friends! Settle in, I've got a good one for you today. I want to talk to you all about my favourite data-interchange format! What's that? Nobody has a favourite data-interchange format? Nonsense! Everyone has one. Mine is JSON. Heard of it? Yeah, I thought so.
It's everywhere in the web world. I'm not going to go into the details of what JSON is in this post, or what it stands for, or where it comes from, but if you're interested in that information, check out www.json.org. What I want to talk about here is some tricks for handling large amounts of JSON in a sane, speedy fashion. Because nobody likes crashing their text editor trying to open a 400 MB JSON file holding 85,000 records with upwards of 20 fields each, just to find the lowest value of one particular field. Yeah, definitely did that to myself at least once. It took an embarrassingly long time until I realized that there must be a better way, but once I found it, hoooo boy. Things improved, let me tell ya.
So. Down to brass tacks. The first thing I'm gonna need you to do is go ahead
and download a fun little command-line program called jq. You can read about it
at https://stedolan.github.io/jq/. If you're using a Mac, you can install it
with Homebrew. Otherwise, there's all kinds of fun instructions if you follow
that link. jq, if you haven't heard of it, is a command-line JSON processor.
It's written in C, and it's super powerful in the right hands. In the wrong
hands, well, luckily for us, it won't do much. At its most basic, it'll take a
single-line JSON string and pretty-print it for you.
This:
{"name": "Mike", "age": 27, "occupation": "developer", "likesPizza": true}
Turns into this:
{
  "name": "Mike",
  "age": 27,
  "occupation": "developer",
  "likesPizza": true
}
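For the record, here's roughly what that looks like at the prompt. This is a minimal sketch, assuming jq is already installed and on your PATH. The '.' filter is jq's identity filter: it passes the input through untouched, and jq's default pretty-printing does the rest. The variable name is just for illustration.

```shell
# Feed a single-line JSON string to jq on stdin.
# '.' is the identity filter: same data out, pretty-printed by default.
pretty=$(echo '{"name": "Mike", "age": 27, "occupation": "developer", "likesPizza": true}' | jq '.')
echo "$pretty"
```

Swap the echo for a file (jq '.' some-file.json) and you get the same treatment for anything on disk.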
Not a huge deal when you're dealing with such a short string, but the real power comes out when you start operating on a bigger dataset.
Take, for example, the aforementioned 400 MB, 85,000 record JSON file. Try
opening that in any text editor and you'll have a bad time. Even worse, try
searching for a particular value within that text editor. It's trouble. Trust
me, I've tried. This is where jq really shines. Let's say that you want to pull
out the most recent timestamp from a record in that file. Print the file with
cat, pipe the output to jq, and provide a complex-looking set of arguments.
Ta-Da! You've got the value. The full command looks like this:
cat ~/path/to/file.json | jq '[.[]."timestamp" | strptime("%m/%d/%Y") | todate] | sort | last'
Without going into too much detail, the arguments passed to jq allow you to
access a particular field on each object, parse it into a standard timestamp
format, sort the results, and take the last one. Operating on the 400 MB file,
it takes about a second to run that command. Pretty impressive, if you ask me.
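If you want to poke at that filter without a giant file handy, here's a small sketch of the same pipeline run against a made-up three-record sample. The sample data and variable names are mine, not from the real file; the filter is the one from the command above, plus -r so jq prints the result without the surrounding quotes.

```shell
# Hypothetical three-record sample standing in for the big file.
sample='[{"timestamp": "03/15/2017"}, {"timestamp": "12/01/2017"}, {"timestamp": "07/04/2016"}]'

# Stage by stage:
#   .[]."timestamp"        emit the timestamp field of every record
#   strptime("%m/%d/%Y")   parse each string into broken-down time
#   todate                 format as ISO 8601, which sorts correctly as plain text
#   [ ... ] | sort | last  collect into an array, sort it, take the final element
latest=$(echo "$sample" | jq -r '[.[]."timestamp" | strptime("%m/%d/%Y") | todate] | sort | last')
echo "$latest"
```

The reformatting step is what makes the sort meaningful: as raw MM/DD/YYYY strings, "12/01/2017" and "07/04/2016" would sort by month first, not by year. With the sample above, the December 2017 record comes out on top, printed as 2017-12-01T00:00:00Z.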
Another recent use case I found is to return records that match a particular filter. Say you want to pull all records matching a particular name.
cat ~/path/to/file.json | jq '.[] | select(.name == "mike")'
Pretty neat, huh?
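One thing worth knowing: == in jq is case-sensitive, so "mike" won't match a record that says "Mike". Here's a hedged sketch of two variations, run against a made-up sample (the data and variable names are hypothetical). It uses jq's --arg option to pass the name in from the shell, and the built-in ascii_downcase to compare without caring about case.

```shell
# Hypothetical sample records; field names mirror the earlier example.
people='[{"name": "Mike", "age": 27}, {"name": "Sam", "age": 31}, {"name": "mike", "age": 40}]'

# --arg binds a shell string to the jq variable $name, so the filter itself
# needs no quote juggling. -c prints each matching record on a single line.
exact=$(echo "$people" | jq -c --arg name "mike" '.[] | select(.name == $name)')
echo "$exact"

# Lowercase both sides before comparing, so "Mike" and "mike" both match.
any_case=$(echo "$people" | jq -c --arg name "mike" '.[] | select((.name | ascii_downcase) == ($name | ascii_downcase))')
echo "$any_case"
```

The first command returns only the lowercase "mike" record; the second returns both Mikes and leaves Sam out.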
Other Fun JQ Stuff
As some of you may know, I'm a fan of Vim (well, specifically Neovim) for the
vast majority of my code editing needs. Vim integrates super well with command
line tooling like jq, so I've got a few useful little custom Vim functions to
manipulate JSON in a sane way. I like to have the ability to minify or prettify
my JSON files as I'm working, and in Vim, invoking jq from within the editor is
dead easy. I've got two short VimScript functions that I've mapped to easy
shortcuts.
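" Pretty-print the whole buffer by filtering it through jq.
" -M turns off colour codes so they never end up in the file.
function! PrettyPrintJSON()
  :%!jq '.' -M
endfunction

" Minify the whole buffer; -c is jq's compact-output flag.
function! MinifyJSON()
  :%!jq '.' -cM
endfunction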
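If you're curious what those two functions actually do to your buffer, you can try the same thing outside of Vim. Vim's :%! sends every line of the buffer through a command and replaces the buffer with whatever comes back, so each function boils down to a plain jq pipeline. A minimal sketch, with a made-up document and variable names:

```shell
# A stand-in for the contents of the Vim buffer.
doc='{"name": "Mike", "likesPizza": true}'

# -M (--monochrome-output) keeps colour escape codes out of the result.
pretty=$(echo "$doc" | jq -M '.')

# -c (--compact-output) squeezes each JSON value onto a single line.
mini=$(echo "$pretty" | jq -cM '.')

echo "$pretty"
echo "$mini"
```

I've put the flags before the filter here and the Vim functions put them after; jq accepts either order. Running the pretty output back through the minifier gets you a compact one-liner again, which is why the two shortcuts work nicely as a toggle.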
I know I'm not the only person to do this, and I think I found this post, which I used as inspiration for my functions, but feel free to shamelessly copy these and make them your own.
That's all I've got for the moment! Happy data-interchanging!