3MW (Safer SQL in R)

Enjoy 3 Minute Wednesdays at no cost.

This newsletter is brought to you for free. If you want to advance your R skills and support my work at the same time, then you will like my paid offerings:

Or use my affiliate code “RAPP10” to get 10% off the fantastic data science courses from Athlyticz:

Guten Tag!

Many greetings from Munich, Germany. In the last couple of newsletters, we talked about SQL databases. And there’s no way we can talk about accessing SQL databases via R code without mentioning SQL injection.

This is a nasty thing that can leak data or even destroy your database if you allow people to interact with it unsafely. And often, that’s exactly what you want (the interaction, not the unsafe part 😆). Here’s a common scenario for that:

  • Users interact with a Shiny dashboard to retrieve specific data from a database.

  • Based on the inputs, you execute some SQL.

  • Your dashboard displays the new data.

So letting users interact with a database is not bad in itself. It’s only bad when you do it uncontrolled. That’s when this bad SQL injection can occur.

But here’s the good news: There’s an easy way to avoid damage. And even better: This method also makes it much easier to assemble SQL code in R.

Let me show you how that works.

SQL Injection via R Variables

Let’s talk about how you might encounter SQL injection in the first place.

Let’s imagine that you want to retrieve data about a specific penguin species from a table in your database. One common (but unsafe) way to do this is to paste together SQL code manually with R variables:

While this works, it’s insecure. If a user-supplied value like species contains malicious SQL code, it could complete your SQL snippet and do something dangerous (e.g. extracting all of your confidential data). This is SQL injection.

And in this specific scenario, here’s what that would look like. Let’s first set up our data base from last time:

And then we can stick in some malicious code into the species variable.

Here, a little trick was used to close the “WHERE” statement with a ' followed by an or-statement that is always true. That way, we will never actually filter the data.

Why Would That Happen?

Now, you might wonder: Why would my variable ever be filled with something malicious?

Well, consider a Shiny app that allows users to filter database results. If you’re not careful, users can input harmful values into the fields that populate those SQL variables. If someone with bad intentions figures out you’re pasting user input into SQL strings, they could inject destructive SQL.

And they don’t even have to do that manually. There are automated tools that try out loads of combinations in the hopes that at some point some SQL injection works.

The Solution: glue_sql() from {glue}

Here’s where {glue} and its glue_sql() function come in. Using glue_sql() makes assembling SQL both easier and safer. You still write your SQL code in text form, but you use curly braces ({}) to insert variables.

Most importantly, glue_sql() handles SQL escaping for you to prevent SQL injection. Here’s an example:

So in the standard case everything works like you’d expect. And in case there’s the same malicious code from before inside the species variable?

Here, glue_sql() will put everything into quotes. That way, a user cannot stick in something to close the initial quotation mark. And the resuls would be empty:

Thus, this is easy to write and secure.

Fantastic! We’ve just seen how glue_sql() helps you write safer SQL code. But there’s more! In the next newsletter, we’ll explore additional features of glue_sql() that make it even more powerful.

And as always, if you have any questions, or just want to reach out, feel free to contact me by replying to this mail or finding me on LinkedIn or on Bluesky.

See you next week,
Albert 👋

Enjoyed this newsletter? Here are other ways I can help you:

Reply

or to participate.