- 3 Minutes Wednesdays
- Posts
- Loading Excel files with missing numbers into R
Loading Excel files with missing numbers into R
Guten Tag! đź‘‹
Many greetings from Munich, Germany. Yesterday, I’ve shown you how easy it can be to turn weird notations like NA
, NULL
or -9999
into actual missing values when you read a csv
-file. With Excel files, things work similar but a tiny bit different.
Let’s try to read an Excel file with the {openxlsx}
package.
Notice that this output doesn’t show us the data formats. That’s because this is a data.frame
instead of a tibble
. For a nicer output, it’s convenient to make this into one.
So here, we can now see that things are once again treated like characters. Luckily for us, the read.xlsx()
function has an argument called na.strings
that works pretty much the same as the na
argument from read_csv()
.
This did half of the work for us. The output now correctly turns NULL
and -9999
into NA
as well. But the column’s data format? It’s still character
.
Thus, we have to manually convert this from character
to double
. Here, we can do that with parse_number()
.
For every data type (in particular datetimes) there are helpful functions like parse_number()
to do the conversion for us. For dates & times, there’s actually quite a few due to the variety of that format. If you’re curious, you can level up your data cleaning skills and find out more in Part 2 of the Data Cleaning Master Class. 👇️
And don’t forget: The 15%-off promo code “PART2RELEASE” is still available for 5 days.
Happy to have you onboard,
Albert
Reply