3MW (Automatic Time Series Collection With {pins} & GH Actions)
Guten Tag!
Many greetings from Munich, Germany. In today’s newsletter, I want to show you how you can combine the {pins} package with GitHub Actions. That way, you can automate the process of downloading data from the internet and saving it as a time series.
The workflow is pretty similar to the one from a previous newsletter on GitHub Actions. So if you’ve read that newsletter, then you should have no problem following along.
A repo for our project
The first thing we need is a new project for our data extraction endeavor. But let’s not start completely from scratch. Let’s use our “gh_actions_test” repo. It already has an {renv} environment for version control set up. That way, we don’t have to worry about that.

Add pins to {renv}
Now we can add {pins} to our {renv} environment. Simply call the renv::install() command.

Then, we can create a new R script pins-demo.R in which we’ll actually use {pins}. For now, this should only contain this code:
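A minimal sketch of such a starter script, assuming a hypothetical bucket called `my-pins-bucket` and that the region comes from the `AWS_DEFAULT_REGION` environment variable. The guard around `board_s3()` is an extra precaution that is not strictly needed; it only keeps the script from erroring on a machine without credentials. Note that `board_s3()` also needs the {paws.storage} package to be installed.

```r
library(pins)

# Hypothetical bucket name; replace it with your own S3 bucket.
bucket_name <- "my-pins-bucket"

# Only connect when credentials are present, so the script does not
# error on a machine that has no AWS access configured.
has_credentials <- nzchar(Sys.getenv("AWS_ACCESS_KEY_ID")) &&
  nzchar(Sys.getenv("AWS_SECRET_ACCESS_KEY"))

if (has_credentials) {
  board <- board_s3(
    bucket = bucket_name,
    region = Sys.getenv("AWS_DEFAULT_REGION")
  )
}
```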

That way, {renv} lets us write the {pins} package to its lock file. If you recall, this is the thing we have to restore at the beginning of every GH workflow. Now make sure that you can execute this code in your project, i.e. that you have AWS credentials set, that you have removed your .Rprofile file from the git repo and added it to .gitignore (because you don’t want to share your keys), and that you saved the packages to the lock file via renv::snapshot().
Alright, cool. So with that we need a new GH actions workflow file.
Base GitHub Actions Workflow
Let’s start out with the .yml workflow file from our last GitHub Actions tutorial. That’s a good place to start, I’d say.

In this project, we don’t need anything from Quarto and we also don’t need to commit to GitHub, so we can remove those steps.

Unfortunately, R isn’t contained in the ubuntu-latest machine from GitHub anymore. That’s why we have to add R ourselves. But that’s super easy to do. We only have to use one of the R containers provided by the Rocker Project.

And with that we are once again ready to use R automations.
Extract Data
Next, we need an R function that fetches data from an external source. This could be via API calls, web scraping, or any other data retrieval method.
To keep things simple, we’ll simulate it with a function that just returns a random value:
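A minimal sketch of such a stand-in function. The function name `fetch_data()` and the column names `timestamp` and `value` are just picked for illustration:

```r
# Stand-in for a real API call or scraper: returns one row with the
# current time and a random value between 0 and 1.
fetch_data <- function() {
  data.frame(
    timestamp = Sys.time(),
    value = runif(1)
  )
}

new_data <- fetch_data()
print(new_data)
```

Storing a timestamp next to the value is what turns the repeated runs into a time series later on.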

Check if pin is available
In the next step, we want to write this data to a pin that is stored in our board in S3. For that, we first need to check if our pin is even available. If not, we simply write to that pin without doing anything else.
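Here is a rough sketch of that check. To keep it runnable without AWS access, it uses a temporary local board from `board_temp()` as a stand-in for the S3 board, and the pin name `time-series-demo` is made up:

```r
library(pins)

# Temporary local board as a stand-in for the S3 board (assumption).
board <- board_temp()
pin_name <- "time-series-demo"  # hypothetical pin name

new_data <- data.frame(timestamp = Sys.time(), value = runif(1))

# First run: the pin does not exist yet, so we simply create it.
if (!pin_exists(board, pin_name)) {
  pin_write(board, new_data, name = pin_name, type = "rds")
}
```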

Grab previous data
If the pin is already available, then we should grab the already saved data, append our new data to it, and write that new complete data to the same pin.
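These three steps could be sketched as follows. Again, the temporary board and the pin name are assumptions so that the snippet runs anywhere, and the first `pin_write()` only simulates an earlier run:

```r
library(pins)

board <- board_temp()           # stand-in for the S3 board (assumption)
pin_name <- "time-series-demo"  # hypothetical pin name

# Pretend an earlier run already stored one row.
pin_write(
  board,
  data.frame(timestamp = Sys.time(), value = runif(1)),
  name = pin_name,
  type = "rds"
)

new_data <- data.frame(timestamp = Sys.time(), value = runif(1))

old_data <- pin_read(board, pin_name)                      # 1. grab saved data
all_data <- rbind(old_data, new_data)                      # 2. append new data
pin_write(board, all_data, name = pin_name, type = "rds")  # 3. write it back
```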

And that’s it. {pins} will do all the data versioning for us. So overall, our pins-demo.R script is only a short little file like this:
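One way the complete script could look, as a sketch under a few assumptions: the helper names `fetch_data()` and `update_pin()` are made up, and a versioned temporary board stands in for the S3 board so that the example runs without credentials. In the real script you would connect with `board_s3()` instead and call `update_pin()` once per run.

```r
library(pins)

# Stand-in for the real data retrieval.
fetch_data <- function() {
  data.frame(timestamp = Sys.time(), value = runif(1))
}

# Append to the pin if it exists, otherwise create it.
update_pin <- function(board, pin_name, new_data) {
  if (pin_exists(board, pin_name)) {
    new_data <- rbind(pin_read(board, pin_name), new_data)
  }
  pin_write(board, new_data, name = pin_name, type = "rds")
}

# Versioned temporary board as a stand-in for the S3 board (assumption).
board <- board_temp(versioned = TRUE)
pin_name <- "time-series-demo"  # hypothetical pin name

# Simulate two scheduled runs. The pause makes sure both runs get
# distinct timestamps.
update_pin(board, pin_name, fetch_data())
Sys.sleep(1.1)
update_pin(board, pin_name, fetch_data())

print(pin_read(board, pin_name))      # the accumulated time series
print(pin_versions(board, pin_name))  # one entry per stored version
```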

Modify GH workflow
Sweet. Now, we only have to tell our GH Actions workflow to run that script. The easiest way is to add another step after all dependencies are installed.

Add AWS Credentials to GitHub Secrets
But if you think about it, this cannot work yet. The machine that is supposed to execute this code doesn’t know our AWS credentials. If this script works on the GH action anyway, then you should seriously check whether you’ve accidentally pushed your .Rprofile file (which you should absolutely not do).
So that’s why we have to tell our step about the environment variables it needs.

And then in your GitHub repository, go to Settings > Secrets and variables > Actions, and add:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
And that’s it! Now every time the workflow runs, a new row gets added to our dataset.
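If one of the secrets is missing or misspelled, the error you get from the S3 connection can be a bit cryptic. A small optional check, sketched here with made-up helper names, lets the script fail early with a clearer message:

```r
required_vars <- c(
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_DEFAULT_REGION"
)

# Return the names of all variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# Stop with a readable message if anything is missing.
check_env_vars <- function(vars = required_vars) {
  missing <- missing_env_vars(vars)
  if (length(missing) > 0) {
    stop("Missing environment variables: ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}
```

Calling `check_env_vars()` at the top of pins-demo.R would then stop the run before any connection to S3 is attempted.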

You now have a working automation that pulls data from anywhere you like and stores it in cloud storage securely and automatically.
As always, if you have any questions, or just want to reach out, feel free to contact me by replying to this mail or finding me on LinkedIn or on Bluesky.
See you next week,
Albert 👋