These days many of the HTTP APIs we query return JSON. That is great, since it is much more readable for humans than XML.
But what about when there is a large JSON document and you only need a small part of it?
There are different ways of doing it, like converting the JSON to a dictionary, looping through everything and filtering the data as you go.
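A minimal sketch of that manual approach, using only the Python standard library. The `people` data and the age threshold are made up for illustration:

```python
import json

# Hypothetical API response; in practice this would come from an HTTP call.
raw = """
{
  "people": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 45},
    {"name": "Carol", "age": 27}
  ]
}
"""

# Convert the JSON text to a dictionary.
data = json.loads(raw)

# Loop through everything and filter as we go.
names = []
for person in data["people"]:
    if person["age"] > 40:
        names.append(person["name"])

print(names)
```

It works, but the filtering logic is tied to the shape of the data and has to be rewritten for every new question you ask of it.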
This is not the way to go in my opinion. We should be querying JSON the same way we query a database.
I have come across a few methods of querying JSON: YAQL, JSONPath and JMESPath. I will go through their applications in this post.
YAQL
YAQL (Yet Another Query Language) can be used to query JSON and YAML.
Read the YAQL Docs here.
I think the YAQL project is closely aligned with StackStorm and their Orquesta markup language.
It has an Online YAQL Evaluator, which is essential, especially when asking for help from others.
JSONPath
JSONPath grew out of XPath, which is used with XML and HTML, because nothing similar existed for JSON.
It looks like it was created in 2007 by Stefan Goessner. You can read the original docs.
More info on JSONPath
It has online JSONPath evaluators: jsonpath and jsonpathfinder
JMESPath
I recently found JMESPath via the Ansible documentation on querying JSON, on the filters page.
Here are some JMESPath examples.
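As a rough illustration of what such a query looks like from Python, here is a small sketch using the third-party `jmespath` package (`pip install jmespath`). The `people` data and the expression are made up for illustration, and the guarded import is only there so the snippet still runs when the package is not installed:

```python
import json

try:
    import jmespath  # third-party package, not part of the standard library
except ImportError:
    jmespath = None  # the query below is skipped if the package is missing

# Hypothetical data for illustration.
data = json.loads("""
{
  "people": [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 45},
    {"name": "Carol", "age": 27}
  ]
}
""")

# Filter the list by age and project the name field.
# Backticks mark a JSON literal inside a JMESPath expression.
expression = "people[?age > `40`].name"

names = jmespath.search(expression, data) if jmespath else None
print(names)
```

The whole query lives in a single expression string, so changing what you ask for does not mean rewriting any loops.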
JQ
jq is a command-line tool for querying and transforming JSON.
Update: Jan 2020
Someone on Twitter recommended glom, a new approach for working with data in Python.
I will certainly give this a go in a bit…
Update: April 2022
There was a Hacker News post about benchmarks of Python tools for querying JSON.
The tools mentioned were:
Update: Nov 2023
There is a new tool called jaq, which claims to be a clone of jq with a few focus areas: correctness, performance and simplicity.