Last Updated: February 25, 2016

Query JSON documents with jq

If you're working with JSON, you probably know the problem: the document is minified, and you're only interested in some parts of it. Take the revisions of the Wikipedia Main Page as an example: through the MediaWiki API that information is quickly available, but the raw response is hard to read.

jq can help you with both problems: it pretty-prints the document and helps you extract only the information you need. To install jq, you can use brew install jq on OS X, and prebuilt binaries are available for Linux.

Some tricks to get started with jq:

Pretty print the document

curl -s 'https://en.wikipedia.org/w/api.php'                     \
     -d 'format=json' -d 'action=query' -d 'titles=Main%20Page'  \
     -d 'prop=revisions' -d 'rvlimit=10'                         \
| jq '.'
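
If you want to try this without hitting the network, here is a minimal sketch on an inline document. The sample variable is a made-up, heavily shortened stand-in for the API response, not real output. It shows the identity filter '.' and, as the counterpart, the -c flag for compact output.

```shell
# Hypothetical sample: a tiny minified document shaped like the API response.
sample='{"query":{"pages":{"15580374":{"title":"Main Page"}}}}'

# '.' is the identity filter: jq re-emits its input, indented for reading.
pretty=$(printf '%s' "$sample" | jq '.')
printf '%s\n' "$pretty"

# -c goes the other way and prints compact, single-line JSON.
compact=$(printf '%s' "$sample" | jq -c '.')
printf '%s\n' "$compact"
```

The compact form is handy when you want one JSON value per line for further processing with other tools.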

Have a look at the top level keys

curl -s 'https://en.wikipedia.org/w/api.php'                     \
     -d 'format=json' -d 'action=query' -d 'titles=Main%20Page'  \
     -d 'prop=revisions' -d 'rvlimit=10'                         \
| jq 'keys'
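
keys works at any level, not just the top one, so it is also a quick way to find out which page ID the response uses. A small sketch, again on a made-up sample (the field names mirror the API response, the content is assumed):

```shell
# Hypothetical sample with the same nesting as the API response.
sample='{"batchcomplete":"","query":{"pages":{"15580374":{"pageid":15580374,"title":"Main Page"}}}}'

# Keys of the top-level object (jq returns them sorted).
top_keys=$(printf '%s' "$sample" | jq -c 'keys')
printf '%s\n' "$top_keys"

# Drill down first, then ask for the keys: this lists the page IDs.
page_ids=$(printf '%s' "$sample" | jq -c '.query.pages | keys')
printf '%s\n' "$page_ids"
```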

Extract comment and timestamp of each revision

curl -s 'https://en.wikipedia.org/w/api.php'                     \
     -d 'format=json' -d 'action=query' -d 'titles=Main%20Page'  \
     -d 'prop=revisions' -d 'rvlimit=10'                         \
| jq '.query.pages["15580374"].revisions[] | {comment, timestamp}'
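
The filter above hard-codes the page ID. If you don't know the ID up front, one option is to iterate over all pages with .query.pages[] instead. The following sketch runs on an invented two-revision sample (users, comments and timestamps are made up for illustration):

```shell
# Hypothetical sample: two invented revisions under one page ID.
sample='{"query":{"pages":{"15580374":{"revisions":[{"user":"A","comment":"first edit","timestamp":"2016-02-01T10:00:00Z"},{"user":"B","comment":"second edit","timestamp":"2016-02-02T11:30:00Z"}]}}}}'

# .pages[] iterates over every page object, whatever its ID is;
# {comment, timestamp} is shorthand for {comment: .comment, timestamp: .timestamp}.
revisions=$(printf '%s' "$sample" \
  | jq -c '.query.pages[].revisions[] | {comment, timestamp}')
printf '%s\n' "$revisions"
```

With a single page in the response this gives the same result as addressing the page by its ID, and the command keeps working for other titles.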

There is an excellent tutorial and a great manual for you to read.

Happy jq-ing!