How truthful are TidyTuesday contributions?

Hello there!

Welcome to this week's edition of my newsletter. Every other week, I share a few thoughts and ideas about data visualization, statistics and web app development. Here's an overview of what I will share with you today:

  1. How truthful are TidyTuesday contributions?

  2. Are food plots always foul?

  3. Unstacking bar charts and visualizing time data with calendars

How truthful are TidyTuesday contributions?

You have likely encountered many well-meaning but not necessarily informed "infographics" throughout the pandemic. Good. Let this be a valuable lesson.

It is a good reminder that you're not automatically an expert on data that you've visualized. That's why getting insights from an expert in the field is so valuable. Often, though, expert insights are not easily available.

For example, I like to try out new plots with data from the weekly TidyTuesday challenge. Yet, if the data of the week is about a sensitive topic, I am not so sure whether it's a good idea to practice on that data.

Let's make this more specific.

Two weeks ago, I practiced creating gauge plots and a custom legend. The underlying data was featured on TidyTuesday and was about the gender pay gap. Here's what I built.

Albert's TidyTuesday contribution on the UK pay gap

I don't want to brag (too much) but I am satisfied with how this viz turned out visually. But really, I practiced ONLY the visual aspect of dataviz. The data side, such as checking the source or running a thorough statistical analysis, was left untouched. That's because my focus was on practicing new visual elements.

In hindsight, though, I wonder if I should have added a warning label. I'm thinking of something along the lines of "This viz was created for training purposes and is not intended for rigorous statistical conclusions". At least for a sensitive topic like the gender pay gap, this may be prudent.

After all, I don't want people to think my viz is well-meaning but uninformed. Maybe such a warning is me being overcautious, though. I still haven't made up my mind about what the "right" thing to do is.

And it appears that there is no clear answer anyway. The last Twitter poll I ran about whether to add a warning resulted in replies that were close to 50-50 (admittedly, the sample size was small). If you wish, you can add your vote to this poll by replying to this mail.

Are food plots always foul?

In the dataviz community, pie charts and spaghetti plots are disliked. A lot. Sure, they can be bad. But are they always? Let's have a look at an example I found on Twitter which made fun of the fact that pie charts were used.

I agree that pie charts are a terrible idea when you have many or similarly-sized categories that you want to know the exact size of. If that's the case, it is hard for readers to compare the angles of the pie slices and you should probably use a different chart type.

But for this particular visual, comparing slices is not a problem. In fact, it is easy to see how the proportions behave over time. So, the pies delivered the intended message quite well.

Of course, this visual didn't exactly show "true data". So, I dug into the internet to find more serious inputs on pie charts. Luckily, Datawrapper put together a great post about how to work with pie charts. You can find it on their website and you should check it out. Enjoy the read.

As for spaghetti plots, I agree that they can be bad. Actually, one of my most popular blog posts shows you how to improve them. But sometimes a lot of messy lines can form patterns. And THAT can be a valuable insight.

For instance, check out this TidyTuesday contribution by Georgios Karamanis. Although it contains many, many lines, it didn't fail to generate insights.

What are other good examples of pie charts and spaghetti plots? Feel free to share your examples with me if you get a chance. You can easily reach out to me by replying to this mail.

Unstacking bar charts and visualizing time data with calendars

I am currently reading Jonathan Schwabish's "Better Data Visualizations" book and it is amazing. It contains many nuggets of dataviz wisdom. Here are just two inspirations I drew from the book.

  1. Calendar plots: These are a neat way to show changes over time (with calendars).

  2. Unstacked bar plots: Instead of stacking bars, each group is given its own small window. Also, an additional window for the sum of the stacked bars is added. This makes comparing groups sooo much easier. This is not to say that stacked bar plots are generally bad. But depending on what your message is, you may want to unstack your bars.

Both types of plots are quite easily implemented with ggplot. On Twitter, I've documented how. You can find the tweet on calendar plots here and the tweet on bar plots here.
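
If you are curious about the data layout behind these two plot types, here is a minimal sketch. It is written in plain Python rather than ggplot, and it is not the code from the tweets. The function names (calendar_cells, unstack_with_total), the "Total" panel label and the example numbers are all made up for illustration. The first function assigns each date a row (week) and a column (weekday), which is the grid a calendar plot draws. The second one splits stacked data into one panel per group and adds a panel holding the sums.

```python
from collections import defaultdict
from datetime import date, timedelta


def calendar_cells(start, end):
    """Yield one (day, week_row, weekday_col) cell per date in [start, end].

    weekday_col: 0 = Monday ... 6 = Sunday.
    week_row: Monday-based weeks counted from the week that contains `start`.
    """
    first_monday = start - timedelta(days=start.weekday())
    day = start
    while day <= end:
        week_row = (day - first_monday).days // 7
        yield day, week_row, day.weekday()
        day += timedelta(days=1)


def unstack_with_total(rows):
    """Turn (category, group, value) rows into one panel per group plus a sum panel.

    Returns {panel_name: {category: value}}. The panel name "Total" is a
    hypothetical label and would clash with a group that is itself called "Total".
    """
    panels = defaultdict(dict)
    totals = defaultdict(float)
    for category, group, value in rows:
        panels[group][category] = value
        totals[category] += value
    panels["Total"] = dict(totals)
    return dict(panels)


# Made-up illustration data, not taken from the book or the tweets.
cells = list(calendar_cells(date(2022, 7, 1), date(2022, 7, 31)))
rows = [("2021", "A", 3), ("2021", "B", 5), ("2022", "A", 4), ("2022", "B", 1)]
panels = unstack_with_total(rows)
```

Each cell from calendar_cells could then be drawn as one tile at (weekday_col, week_row), and each entry in panels as its own small bar chart, for example as tiles and facets in ggplot.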

That's it for today. As always, I'm eager to hear your feedback. So, don't be shy. Also, this newsletter is still in its infancy. Thus, if you liked what you've read, I'd be happy if you spread the word.

Of course, share this mail only if you're comfortable with that. I am the last person who wants to create spam. And if you've been forwarded this mail, think about signing up for my newsletter here.

Enjoy the rest of your day and see you next time!

Albert
