The Ultimate Interactive JQ Guide

Has this ever happened to you?

You’ve just received a massive JSON file that looks like it was designed to confuse you. Or maybe you ran a command and got back so much JSON that it was incomprehensible.

Important: Level up your jq skills with this accompanying FREE practical project on DataWars — enhance your skills through real-world challenges!

I’ve partnered with DataWars to include Hands On Practical Projects that complement this jq guide, designed to deepen your understanding and enhance your skills in a real-world setting.

The first project, Pokemon Data Analysis is available now.

Click Here to Start Practicing for FREE and LEVEL UP your jq skills

The data you need is buried inside, and you’re dreading the hours it’ll take to extract and clean it up.

I’ve been there. I’ve grepped my way through JSON and written ad-hoc Python scripts to process it for me.

But things don’t have to be like this.

Introduction

jq is one of the best-kept secrets in the data processing world.

Here are some scenarios where jq could swoop in to save your day (it regularly saves mine):

  1. Integrating with APIs in shell scripts often means handling JSON responses, requiring data extraction and manipulation.
  2. Data from different sources may need to be converted to or from JSON format for compatibility.
  3. Managing software configuration files in JSON format can be a regular task.
  4. Extracting data from websites often results in dealing with JSON data that requires parsing and filtering.
  5. Server logs and monitoring data often use JSON, necessitating parsing and analysis.
  6. Infra as Code tools like Ansible and Terraform use JSON-like configurations, requiring management. JSON is a subset of YAML, so every valid JSON file is also a valid YAML file.

All examples are ✨fully interactive✨, so I encourage you to play around!
In fact, I’ll be downright heartbroken if you don’t, because I put a lot of effort into it. You can edit both the input JSON data and the jq program.

Let’s dive in! We’ll start off easy, and get slowly deeper into the weeds.

Basic Operations

Selecting values

Everything in jq is a filter. The dot . is used to select the current object or element, and we can put the property name after it to access a key from an object:
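
Here is a minimal sketch of key access from the command line. The input document is made up for illustration, and I'm assuming jq is installed and on your PATH.

```shell
# Hypothetical input, invented for this example.
input='{"name": "apple", "details": {"color": "red", "price": 120}}'

# .name reads a top-level key; .details.color chains two lookups.
name=$(printf '%s' "$input" | jq '.name')
color=$(printf '%s' "$input" | jq '.details.color')

echo "$name"    # "apple"
echo "$color"   # "red"
```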

Filtering Arrays

The .[] notation is used to iterate over the elements of an array in a JSON document. It allows you to access each element of an array and perform operations on them.

The select() function is used to filter JSON data based on a specified condition or criteria. It is a powerful tool for extracting specific elements from a JSON document that meet certain conditions.

Similar to shell scripting, jq works in a pipes-and-filters manner. We use the | to send the data from one filter to the next.
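
As a rough sketch of how these three pieces fit together, the snippet below iterates over an array, keeps the cheap items, and prints their names. The data and the price threshold are assumptions of mine.

```shell
# Hypothetical input: prices are whole numbers to keep the output unambiguous.
input='[{"name":"apple","price":120},{"name":"banana","price":50},{"name":"cherry","price":300}]'

# .[] emits each element, select() keeps those under 200, .name reads the key.
cheap=$(printf '%s' "$input" | jq -r '.[] | select(.price < 200) | .name')

echo "$cheap"
# apple
# banana
```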

Mapping Arrays

We can use the map function to run any operation on every element of the array and return a new array containing the outputs of that operation:
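
A small sketch, with a made-up array of numbers, that doubles every element:

```shell
# map(f) applies f to each element and collects the results into a new array.
doubled=$(printf '%s' '[1, 2, 3]' | jq -c 'map(. * 2)')

echo "$doubled"   # [2,4,6]
```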

Combining Filters

The pipe operator | can be used to chain as many filters or functions as we want:
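
To illustrate, here is a sketch that chains three steps. The in_stock field and the sample data are hypothetical.

```shell
# Hypothetical input, invented for this example.
input='[{"in_stock":true,"name":"apple"},{"in_stock":false,"name":"banana"},{"in_stock":true,"name":"cherry"}]'

# Keep the items in stock, pull out their names, then join them into one string.
available=$(printf '%s' "$input" | jq -r 'map(select(.in_stock)) | map(.name) | join(", ")')

echo "$available"   # apple, cherry
```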

Splitting Strings

We can use the split() function to split a string on a particular separator.
Note also the usage of .[0] to select the first element of the resulting array.
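
For instance, assuming a made-up email field, this sketch extracts the part before the @ sign:

```shell
# split("@") turns the string into ["ada", "example.com"]; .[0] picks the first piece.
user=$(printf '%s' '{"email": "ada@example.com"}' | jq -r '.email | split("@") | .[0]')

echo "$user"   # ada
```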

Conditional Logic

We can use if-then-else-end to create conditional expressions:
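
A minimal sketch, using invented scores and a pass mark of 50 that I picked arbitrarily:

```shell
# Each number is mapped to a label; the expression must be closed with "end".
grades=$(printf '%s' '[72, 35, 50]' | jq -r '.[] | if . >= 50 then "pass" else "fail" end')

echo "$grades"
# pass
# fail
# pass
```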

Handling Null Values

Null values can often mess up the logic in our scripts, so we can filter them all out using map and select:
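
One way this might look, with a made-up array:

```shell
# select(. != null) drops the nulls; map() rebuilds the array from what is left.
clean=$(printf '%s' '[1, null, 2, null, 3]' | jq -c 'map(select(. != null))')

echo "$clean"   # [1,2,3]
```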

Formatting Output

Sometimes we don’t want JSON output. We want it in a particular string format.
Note the use of the -r flag, which makes the output raw. Without it, each string would be printed as JSON, with quote marks around it.
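
The sketch below uses string interpolation to build a sentence per item, once with -r and once without. The data is invented.

```shell
# Hypothetical input, invented for this example.
input='[{"name":"apple","price":120},{"name":"banana","price":50}]'

# \( ... ) inside a jq string interpolates the value of an expression.
raw=$(printf '%s' "$input" | jq -r '.[] | "\(.name) costs \(.price)"')
quoted=$(printf '%s' "$input" | jq '.[0] | "\(.name) costs \(.price)"')

echo "$raw"
# apple costs 120
# banana costs 50
echo "$quoted"   # "apple costs 120"
```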

Multiple Outputs

Curly braces create a new object, which we can use for multiple outputs:
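
As a sketch, assuming a made-up record with a nested address, we can pick two values and return them together as one object:

```shell
# Hypothetical input, invented for this example.
input='{"address": {"city": "London", "zip": "N1"}, "name": "Ada"}'

# Each key in the braces gets its value from its own filter.
summary=$(printf '%s' "$input" | jq -c '{city: .address.city, name: .name}')

echo "$summary"   # {"city":"London","name":"Ada"}
```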

Dealing with Nested Items

JSON is very commonly used to store nested objects, and we often need to traverse or manipulate such structures. jq gives us all the tools we need to make it easy:

Recursive Descent

We can use .. to recursively descend through a tree of an object.
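
For example, this sketch collects every number in a made-up document, no matter how deeply it is nested:

```shell
# Hypothetical input, invented for this example.
input='{"a": 1, "b": {"c": 2, "d": [3, {"e": 4}]}}'

# .. emits the document and every value inside it; numbers keeps only the numbers.
all_numbers=$(printf '%s' "$input" | jq -c '[.. | numbers]')

echo "$all_numbers"   # [1,2,3,4]
```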

Filtering Nested Arrays

Flattening Nested JSON Objects

Often, we just want all the key-values, and flattening the object may be the most convenient way to go.

This is an example where the operation we want to do is fairly straightforward, but the program looks way too scary.

Let’s try to break it down:
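
Here is a sketch of one common way to flatten an object, with each step explained in a comment. This is my own illustration with made-up input, so treat the program as an assumption rather than the only way to do it.

```shell
# Hypothetical input, invented for this example.
input='{"a": {"b": 1, "c": {"d": 2}}, "e": 3}'

flat=$(printf '%s' "$input" | jq -c '
  [ paths(scalars) as $p                          # every path that ends in a leaf value
    | { key:   ($p | map(tostring) | join(".")),  # ["a","c","d"] becomes "a.c.d"
        value: getpath($p) }                      # the value stored at that path
  ] | from_entries                                # fold the key/value pairs into one object
')

echo "$flat"   # {"a.b":1,"a.c.d":2,"e":3}
```

The map(tostring) step is there so that array indices, which are numbers, can be joined into the key as well.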

Recursive Object Manipulation

We can also use the recurse function to traverse a tree.
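
A minimal sketch, assuming a tree where every node has a name and a children array (both names are hypothetical):

```shell
# Hypothetical tree, invented for this example.
input='{"name":"root","children":[{"name":"a","children":[{"name":"a1","children":[]}]},{"name":"b","children":[]}]}'

# recurse(f) emits the input, then keeps applying f to whatever it produced.
names=$(printf '%s' "$input" | jq -r 'recurse(.children[]) | .name')

echo "$names"
# root
# a
# a1
# b
```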

Complex Object Transformation

Walk through object and apply a transformation conditionally

The walk() function provides a convenient way to traverse a nested object and apply some transformation to it.
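
As an illustration with made-up data, this sketch upper-cases every string in a document and leaves everything else alone:

```shell
# Hypothetical input, invented for this example.
input='{"a": "x", "b": ["y", {"c": "z"}], "d": 1}'

# walk(f) applies f to every value, bottom-up; the if leaves non-strings untouched.
shouted=$(printf '%s' "$input" | jq -c 'walk(if type == "string" then ascii_upcase else . end)')

echo "$shouted"   # {"a":"X","b":["Y",{"c":"Z"}],"d":1}
```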

Statistical Operations

jq is incredibly handy for doing quick and dirty statistical analysis in the field. Here are the most common operations for that:

Sorting Arrays

Sorting an array is a basic operation that is useful for many things in statistics.
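
A short sketch with invented data, showing sort for plain values and sort_by for objects:

```shell
# sort orders plain values.
sorted=$(printf '%s' '[3, 1, 2]' | jq -c 'sort')

# sort_by(f) orders objects by whatever f returns for each of them.
by_age=$(printf '%s' '[{"age":41,"name":"Ada"},{"age":29,"name":"Bob"}]' | jq -r 'sort_by(.age) | .[].name')

echo "$sorted"   # [1,2,3]
echo "$by_age"
# Bob
# Ada
```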

Extracting unique values from an array is another fairly basic operation that we need for many things.
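
For example, with a made-up array containing duplicates:

```shell
# unique removes duplicates and returns the remaining values sorted.
distinct=$(printf '%s' '[1, 2, 2, 3, 1]' | jq -c 'unique')

echo "$distinct"   # [1,2,3]
```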

Calculating Averages

Calculating the mean of a dataset is a common statistical operation:
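
A minimal sketch with invented numbers. Note that it assumes the array is not empty, since an empty array has nothing to divide.

```shell
# add sums the elements, length counts them, and dividing gives the mean.
mean=$(printf '%s' '[2, 4, 6, 8]' | jq 'add / length')

echo "$mean"   # 5
```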

Grouping and Aggregating

We can group an array of objects by a particular key and get an aggregated value of the other keys fairly easily:
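
Here is one way this might look, assuming made-up records with a category and a price:

```shell
# Hypothetical input, invented for this example.
input='[{"category":"fruit","price":3},{"category":"veg","price":2},{"category":"fruit","price":5}]'

# group_by() returns an array of groups; each group is summarised into one object.
totals=$(printf '%s' "$input" | jq -c '
  group_by(.category)
  | map({category: .[0].category, total: (map(.price) | add)})
')

echo "$totals"   # [{"category":"fruit","total":8},{"category":"veg","total":2}]
```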

Filtering after Aggregation

Custom Aggregation with reduce

We can also use reduce to perform a single-output aggregation from an array:
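
As a sketch, the snippet below counts how often each word appears in a made-up array, building up a single object as it goes:

```shell
# reduce starts from {} and updates it once per element, bound to $w each time.
counts=$(printf '%s' '["a", "b", "a"]' | jq -c 'reduce .[] as $w ({}; .[$w] += 1)')

echo "$counts"   # {"a":2,"b":1}
```

The first time a word is seen its count is null, and jq treats null + 1 as 1, so no initialisation is needed.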

Calculating Histogram Bins

We may want to calculate a histogram from an array of data.
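
One possible sketch, assuming bins that are 10 wide (my choice) and a made-up array of values:

```shell
# Round each value down to the start of its bin, then count the members of each bin.
histogram=$(printf '%s' '[1, 5, 12, 18, 19, 25]' | jq -c '
  map(. / 10 | floor * 10)
  | group_by(.)
  | map({bin: .[0], count: length})
')

echo "$histogram"   # [{"bin":0,"count":2},{"bin":10,"count":3},{"bin":20,"count":1}]
```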

Other Common Operations

These are some other operations I find myself doing every day, but I couldn’t think of a better way to categorize them.

We can combine multiple conditions in a select call using and and or. The test() function checks whether a string matches a regular expression, so a pattern like "a|b" matches strings that contain either substring.
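
For instance, this sketch keeps only the error entries whose message mentions a timeout or a refused connection. The log records and field names are invented.

```shell
# Hypothetical log entries, invented for this example.
input='[{"level":"error","message":"connection refused"},{"level":"info","message":"request timeout ignored"},{"level":"error","message":"disk full"},{"level":"error","message":"read timeout"}]'

# Both conditions must hold: the level is "error" and the message matches the regex.
matches=$(printf '%s' "$input" | jq -r '
  .[] | select(.level == "error" and (.message | test("timeout|refused"))) | .message
')

echo "$matches"
# connection refused
# read timeout
```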

Formatting Unix Timestamps

Various tools emit Unix timestamps, and we can use the handy strftime function to format them so they’re easier to understand at a glance.
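
A minimal sketch, assuming a made-up ts field holding seconds since the epoch. The result is in UTC.

```shell
# strftime takes seconds since the epoch and a format string.
when=$(printf '%s' '{"ts": 1700000000}' | jq -r '.ts | strftime("%Y-%m-%d %H:%M:%S")')

echo "$when"   # 2023-11-14 22:13:20
```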

Enumerating by Top Level Key and Value

Closing Thoughts

Whew! That’s been a long article 😅 If you’re still here, then I appreciate you staying till the very end.

I hope you’ve learned something new, and that you’ll be able to quickly identify use cases for jq in your current workflow and apply your learnings there.

Get In Touch

If you have any suggestions on how this may be improved, errors that I might have made, or you just want to discuss any other topic, please feel free to email me. I always love to hear from you.
