
How to never look up tidyr pivot_wider and pivot_longer again

InfoWorld | Oct 21, 2021

Tidyr is a handy R package for reshaping data, but it can be hard to remember exactly how pivot_wider and pivot_longer work. Thanks to RStudio code snippets, you can write a snippet once and then always have a fill-in-the-blank code template at your fingertips!

Copyright © 2021 IDG Communications, Inc.

Hi, I’m Sharon Machlis at IDG, here with Episode 65 of Do More With R: Never have to look up how tidyr’s pivot_wider and pivot_longer functions work again!
A lot of tidyverse users turn to the tidyr package for reshaping data. But I’ve seen people say they can’t remember exactly how its pivot_wider() and pivot_longer() functions work. Luckily, there’s an easy answer: RStudio code snippets! Write a snippet once, and what’s basically a fill-in-the-blank form will always be at your fingertips. Let’s take a look!
I’ll start with going from wide to long.
To go from wide-to-long (or wide-to-tidy), you’d use the pivot_longer() function. First argument is your data frame, then you need to set up other arguments. The most important ones are cols, the names of the columns you want to pivot longer; names_to, the name you want for the single new category column; and values_to, the name you want for the single new value column. cols follows the tidyverse convention of not putting existing column names in quotation marks. The names of the new columns are quoted character strings, though. That’s because they’re not existing column variables.
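As a minimal sketch of those three arguments, here is pivot_longer() on a tiny made-up data frame. The scores_wide data and its column names are hypothetical, invented only for illustration; they are not from the video.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: one row per student, one column per quiz
scores_wide <- tibble(
  student = c("Ana", "Ben"),
  quiz1 = c(90, 75),
  quiz2 = c(85, 80)
)

scores_long <- pivot_longer(
  scores_wide,
  cols = quiz1:quiz2,   # existing columns, so no quotation marks
  names_to = "quiz",    # new category column, so a quoted string
  values_to = "score"   # new value column, so a quoted string
)

print(scores_long)
```

Each student now has one row per quiz, with the old column names stored in quiz and the numbers in score.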
Will you remember all that? Great! If not . . . that’s what code snippets are for. Let me demo my snippet in action. I’ll start with the old mtcars data set. It doesn’t have a category column, so I’ll use the tibble package’s handy rownames_to_column() function to turn the row names into a new column called “Model”.
If I want this in “tidy” or long format, all the columns starting from mpg to the last one should be pivoted longer. To create that mtcars_long data frame, I want to pivot_longer(). I created a snippet I called plonger. If I start typing plonger, my snippet’s name appears as a choice and I can select & use it.
Do you see what happened? I’ve got explainer code here. And, it’s also fill-in-the-blanks. My cursor is on the first fill-in part, so I type in the name of the data frame (mtcars) and hit the tab key. Next I select all the columns I want to pivot. Fortunately, I can use dplyr’s select() syntax instead of naming every column. So, I can type the first column name, a colon, and the last column name if the ones I’m selecting are consecutive. The next 2 are easy – the names I want for my new columns. The quotation marks are already there in my snippet - I didn’t have to remember them. Now I’ll run this code. Voila! Tidy data.
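For anyone reading rather than watching, the filled-in code from this demo would look roughly like the sketch below. The new column names "Category" and "Value" and the intermediate mtcars_named object are assumptions for illustration; the transcript doesn't say which names appear on screen.

```r
library(tidyr)
library(tibble)

# Turn the row names into a regular column called "Model"
mtcars_named <- rownames_to_column(mtcars, "Model")

# Pivot every column from mpg through carb (the last one) into long format
mtcars_long <- pivot_longer(
  mtcars_named,
  cols = mpg:carb,         # first column, colon, last column
  names_to = "Category",   # assumed name for the new category column
  values_to = "Value"      # assumed name for the new value column
)

head(mtcars_long)
```

The result has one row for each combination of car model and measurement.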
This is the snippet code. The usethis package’s edit_rstudio_snippets() function opens your snippet file for editing. All the code is in the InfoWorld article associated with this video (if you’re viewing on YouTube, the link is below). If you’ve never used RStudio snippets before, check out my tutorial – also linked to below the video on YouTube.
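Since the snippet itself only appears on screen, here is a rough sketch of what a plonger snippet could look like in the RStudio snippets file. The placeholder wording is an assumption and may differ from the version in the InfoWorld article. Each ${n:text} marks a fill-in-the-blank field, and the line under the snippet name must begin with a tab character.

```text
snippet plonger
	pivot_longer(${1:mydf}, cols = ${2:columns_to_pivot_long}, names_to = "${3:new_category_column}", values_to = "${4:new_value_column}")
```

After saving the file, typing plonger in a script and selecting it from the autocomplete list inserts the template, and the tab key moves from one field to the next.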
Next up: long to wide and the pivot_wider() function. Here are some of its most important arguments: You start with the data frame.
id_cols is optional – a vector of all the columns you don’t want to pivot. (If you don’t define that, pivot_wider() assumes that’s “everything you didn’t otherwise mention”.) names_from are the columns that you want to go from long to wide. Each value in the pivoted column turns into its own column. Like in the original, wide mtcars data: Each category like mpg and carb was its own column. values_from are the columns that contain data which also need to pivot wide. names_sep is optional – if you end up with compound column names, it’s what you want as the character separating the 2 strings. This will make more sense when you see the code.
The us_rent_income data set is long. It’s got columns for the GEOID, state NAME, variable of income and rent, the estimated value, and the margin of error. If I want a more human-readable and sortable version, I’d want income and rent to each have their own columns. For the data, it would be helpful to have both the estimated value and the margin of error. Let’s use my 2nd snippet, pwider. Once again, sample code with fill-in-the-blanks. I’ll type us_rent_income for the data frame, skip the optional id_cols, use variable as my category column, and use a vector with both estimate and moe (the margin of error) as my value columns. Now let’s run the code. And there’s a wide data frame with columns for estimated income, estimated rent, income margin of error, and rent margin of error. Once again you can see the snippet code in the related InfoWorld article, if you don’t feel like pausing the video and copying the code manually.
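Written out, the filled-in call from this demo would look something like the sketch below. The rent_wide name is an assumption for illustration, and names_sep is spelled out even though "_" is already the default.

```r
library(tidyr)

# us_rent_income ships with tidyr: GEOID, NAME, variable, estimate, moe
rent_wide <- pivot_wider(
  us_rent_income,
  names_from = variable,            # income / rent become column suffixes
  values_from = c(estimate, moe),   # both value columns pivot wide
  names_sep = "_"                   # separator used in the compound names
)

names(rent_wide)
```

Because two value columns are pivoted, each new column name combines the value column and the category, such as estimate_income and moe_rent.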
But to recap: usethis::edit_rstudio_snippets() to open your snippets file. To use a snippet, you start typing the snippet name, select it, and then hit tab if the snippet includes fill-in-the-form type variables. And, there’s the code for both snippets. Note that all the lines under the snippet name line MUST start with a tab.
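For reference, a pwider snippet along the same lines might look like this sketch. Again, the placeholder wording is an assumption rather than the exact code from the article. If you don’t need id_cols, delete that argument after the template is inserted.

```text
snippet pwider
	pivot_wider(${1:mydf}, id_cols = ${2:optional_columns_not_to_pivot}, names_from = ${3:category_columns}, values_from = ${4:value_columns}, names_sep = "_")
```

As with plonger, the line under the snippet name starts with a tab character, not spaces.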

That’s it for this episode, thanks for watching! For more R tips, head to the Do More With R page at bit-dot-l-y slash do more with R, all lowercase except for the R. You can also find the Do More With R playlist on YouTube’s IDG Tech Talk channel where you can subscribe so you never miss an episode. Hope to see you next time. Stay healthy and safe, everyone!