3MW (Use colors strategically and highlight your insights)

Guten Tag!

Many greetings from Ulm, Germany. Today, I’m going to show you one way to use colors strategically. So instead of creating a spaghetti plot like this where all the different lines get a distinct color…

…I’m going to show you the key ingredient to create a line chart like this. Spoiler: It’s really, really easy to pull off.

And don’t get me wrong. Showing all the data like in the first plot is absolutely important. After all, you don’t want to hide the data. But using colors to highlight certain aspects of your data allows you to tell your data story in the context of all the (greyed out) data.

Obligatory reminder: All of the code can be found on GitHub.

A basic plot to modify

Those of you who also watch my YT videos will probably have noticed that the previous charts came from my last video. This means that we can focus on the highlighting mechanics in ggplot using a simple chart.

And if you want to go all the way and create the above charts, you will find this fully documented on YouTube. So let us create a simple chart like this.

Here, you see a histogram that shows the distribution of penguin weights from the {palmerpenguins} package. This is the first plot we want to modify. The code for this was pretty straightforward.
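The exact code lives on GitHub. As a rough sketch of what such a histogram could look like: the bin count, the alpha value and the helper name penguins_clean are assumptions made for this example, not necessarily what the original plot used.

```r
library(ggplot2)

# Drop the rows with missing weights (penguins_clean is a hypothetical helper name)
penguins_clean <- subset(palmerpenguins::penguins, !is.na(body_mass_g))

p <- ggplot(penguins_clean, aes(x = body_mass_g, fill = species)) +
  # position = "identity" lets the bars overlap instead of stacking them,
  # and alpha < 1 makes the overlapping bars see-through
  geom_histogram(alpha = 0.5, position = "identity", bins = 30)

p
```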

Use a window for each species

In a histogram like this, it is a bit annoying that the bars overlap so much for different species. That makes it hard to concentrate on one species, and I even had to lower the alpha value (making the bars more transparent) so that we can see anything at all. One way to fix that is to give each species its own window.
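In ggplot, "windows" are facets. A minimal sketch of the idea, assuming the same histogram settings as a starting point (bins, alpha and the name penguins_clean are illustrative choices):

```r
library(ggplot2)

# Hypothetical helper name: penguins without missing weights
penguins_clean <- subset(palmerpenguins::penguins, !is.na(body_mass_g))

p <- ggplot(penguins_clean, aes(x = body_mass_g, fill = species)) +
  geom_histogram(alpha = 0.5, position = "identity", bins = 30) +
  # One panel ("window") per species
  facet_wrap(vars(species))

p
```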

This is neat but it comes with a crucial drawback: It’s hard to compare the weights across species because each window shows only one species. So in each window, there’s a lot of missing context.

Use windows AND show all the data

In order to give each species its own window and show the other data for context, we can simply use the {gghighlight} package. You wouldn’t believe how easy it is to use this package. After we’ve loaded it, we just have to add a gghighlight() layer to the plot and it will do its magic.
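Here is a sketch of how that could look. The histogram settings and the name penguins_clean are again assumptions for this example. Only the gghighlight() line is new compared to a plain faceted histogram.

```r
library(ggplot2)
library(gghighlight)

# Hypothetical helper name: penguins without missing weights
penguins_clean <- subset(palmerpenguins::penguins, !is.na(body_mass_g))

p <- ggplot(penguins_clean, aes(x = body_mass_g, fill = species)) +
  geom_histogram(position = "identity", bins = 30) +
  # No condition given: together with the facets, each panel keeps its own
  # species coloured and shows the remaining data greyed out
  gghighlight() +
  facet_wrap(vars(species))

p
```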

Specify the highlighted part

Due to the faceting, gghighlight() automatically understood that you want to show all the data but also highlight the corresponding species in each window.

Normally, you will have to specify what parts of your chart you want to highlight. The way to do that is to give conditions for the highlighted parts, just like in a filter() call.

Here, this will highlight only parts of the data in each window.
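A condition could look like the sketch below. The 4500 g threshold is an arbitrary example, not the condition from the original chart, and I set use_group_by = FALSE here so that the condition is applied row by row like a plain filter().

```r
library(ggplot2)
library(gghighlight)

# Hypothetical helper name: penguins without missing weights
penguins_clean <- subset(palmerpenguins::penguins, !is.na(body_mass_g))

p <- ggplot(penguins_clean, aes(x = body_mass_g, fill = species)) +
  geom_histogram(position = "identity", bins = 30) +
  # Highlight only the penguins heavier than 4500 g (illustrative threshold);
  # use_group_by = FALSE asks for a row-wise filter instead of a per-group one
  gghighlight(body_mass_g > 4500, use_group_by = FALSE) +
  facet_wrap(vars(species))

p
```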

Highlight lines

And the same idea works with other chart types too. Remember our spaghetti plot from the beginning? By adding a gghighlight() layer we can create the desired highlighting.
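The data behind the spaghetti plot is not part of this newsletter, so the sketch below uses economics_long, a data set that ships with ggplot2, as a stand-in. The two highlighted series are arbitrary picks.

```r
library(ggplot2)
library(gghighlight)

# economics_long is a stand-in data set; value01 is each series rescaled to [0, 1]
p <- ggplot(economics_long, aes(x = date, y = value01, colour = variable)) +
  geom_line() +
  # Keep two series coloured and grey out the rest (arbitrary choice of series)
  gghighlight(variable %in% c("unemploy", "psavert"))

p
```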

This way, we get a chart like this.

Fine-tuning your highlights

You may have noticed two things:

  1. You get a weird grouping warning when you run the code.

  2. The legend disappeared and small direct labels appeared instead.

By setting use_group_by and use_direct_label to FALSE, you will get rid of the annoying warning and bring back the legend.
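As a sketch, here is how the two arguments could be set. It uses the economics_long data set from ggplot2 as stand-in data, and the highlighted series are arbitrary.

```r
library(ggplot2)
library(gghighlight)

p <- ggplot(economics_long, aes(x = date, y = value01, colour = variable)) +
  geom_line() +
  gghighlight(
    variable %in% c("unemploy", "psavert"),
    use_group_by = FALSE,     # filter row by row, no per-group calculation
    use_direct_label = FALSE  # no direct labels, so the legend is kept
  )

p
```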

Finally, you can set the aesthetics of the unhighlighted parts via a list that you pass to unhighlighted_params.
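For example, something along these lines should work. The colour and alpha values are placeholders, and economics_long is once more stand-in data.

```r
library(ggplot2)
library(gghighlight)

p <- ggplot(economics_long, aes(x = date, y = value01, colour = variable)) +
  geom_line() +
  gghighlight(
    variable %in% c("unemploy", "psavert"),
    use_group_by = FALSE,
    use_direct_label = FALSE,
    # Aesthetics for the greyed-out lines (placeholder values)
    unhighlighted_params = list(colour = "grey80", alpha = 0.5)
  )

p
```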

So that’s how you can highlight certain aspects of your chart. Of course, you can always fine-tune the theme() of the plot to create the chart from above. My YT video will show you how.

That’s it for today. Hope you’ve enjoyed this week’s newsletter. If you want to reach out to me, just reply to this mail or find me on Twitter.

See you next week,
Albert 👋
