
R tip: Drag-and-drop ggplot

InfoWorld | Oct 12, 2018

See how the new graphical user interface for ggplot2 works, thanks to the esquisse R package

Copyright © 2018 IDG Communications, Inc.

Hi, I’m Sharon Machlis, Director of Editorial Data & Analytics at IDG Communications. I’m here with Episode 13 of Do More With R: Drag-and-drop ggplot.
Some R users become a little leery of graphical user interfaces. Pointing and clicking and dragging may be convenient, but it can be harder to save or check or re-run an analysis.
But I think even most hard-core command-line junkies would agree that a drag-and-drop interface can be helpful for some exploratory dataviz.
That’s what the new R package esquisse brings to ggplot2. It gives the best of both worlds: drag-and-drop, plus generating basic ggplot code for the graphs you create. And, it’s pretty cool! esquisse was created by 2 people at a French R consulting firm, dreamRs. The name esquisse is French for sketch.
Let’s take a look at the package.
I’ll use one of my favorite types of data sets, airline flight performance. Here I’ve prepped 2 data frames – flights from Boston to Austin, Texas in January 2018, and flights back from Austin to Boston the same month. I’m flying to Austin this coming January 2019 for the RStudio conference. I’m curious to see what delays looked like last January. (Although of course past performance is no guarantee of future results – or weather.)
I usually open esquisse ggplot builder from the RStudio Addins menu. The default behaves like a usual add-in. But you can also open it in your browser if you want. Just set the R option esquisse.display.mode to browser:
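As a minimal sketch of that setting, using the option name and value mentioned above (later releases of esquisse may use a different option name, so check the documentation for your installed version):

```r
# Ask the esquisse add-in to open in the default web browser
# instead of the RStudio dialog pane.
# The option name is the one given in this episode; it is not
# guaranteed to be unchanged in newer esquisse versions.
options("esquisse.display.mode" = "browser")

# Confirm what the option is currently set to
getOption("esquisse.display.mode")
```

Setting the same option to "dialog" switches back to the RStudio dialog pane.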
Now look what happens if I run the esquisse ggplot builder add-in.
It pops open my default browser. But I prefer the usual RStudio dialog pane. So, I’ll close this, set the display mode to dialog, and re-open.
First, I’ll choose one of the data frames loaded into my current working session: to_austin.
If I click on the Validate chosen variable drop-down, I’ll see all the available columns and choose which ones I want. I’ll keep them all for now, and click choose.
Now I’ve got my drag-and-drop interface. Let’s say I’d like to look at arrival delays by carrier. I’ll put OP_CARRIER in the X box and ARR_DELAY in the Y box. By the way B6 is JetBlue, DL is Delta, and WN is Southwest. It might be a little easier to see if I did a fill color by carrier also.
Hmmm. JetBlue has the lowest median flight delay but a couple of rather alarming outliers. I wonder which flights those were? I can change the X value from carrier to flight number, still coloring by carrier.
Yikes, that’s the flight I was thinking of taking, 1039. I’m flying in on a Wednesday, so maybe the mid-week data is better? See the Data panel at the bottom? It gives me the option to filter my data.
I’m going to look just at Wednesdays.
Much better. Maybe I’ll take that flight after all.
Let’s see some of the other esquisse options. I can change my axis titles with Labels & Title.
I can change my color palette and theme under Plot options, and also move or remove the legend. I’ll change the palette to one of my favorite ColorBrewer palettes, Dark 2.
And then play around with some themes.
Even if you’re really comfortable creating your graphs by writing ggplot code, this is a great way to see how different color palettes and themes look on your graph.
And now here’s a really cool part of this add-in. If I go to Export & Code, I have the R code that generated this ggplot graph. If I click Insert code in script, the code will appear wherever my cursor was last in RStudio. Or, safer, click Copy to clipboard, close the add-in, and paste the code into my script.
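For a rough idea of what that kind of code looks like, here is a hand-written sketch of a boxplot of arrival delay by carrier with the Dark2 palette. It is not the add-in's actual output. The small to_austin data frame below is made-up stand-in data, and the axis labels and theme are assumptions chosen for illustration.

```r
library(ggplot2)

# Made-up stand-in for the to_austin data frame; the real data
# has more rows and columns than shown here.
to_austin <- data.frame(
  OP_CARRIER = c("B6", "B6", "B6", "DL", "DL", "DL", "WN", "WN", "WN"),
  ARR_DELAY  = c(-10, 5, 120, 0, 15, 30, 10, 25, 40),
  stringsAsFactors = FALSE
)

# Boxplot of arrival delay by carrier, filled by carrier,
# using the ColorBrewer Dark2 palette.
p <- ggplot(to_austin) +
  aes(x = OP_CARRIER, y = ARR_DELAY, fill = OP_CARRIER) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Dark2") +
  labs(x = "Carrier", y = "Arrival delay") +  # illustrative labels
  theme_minimal()                             # illustrative theme

print(p)
```

Having the plot as code means it can be saved in a script and re-run, which addresses the usual worry about point-and-click tools.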
If I want, say, a bar graph of average delays by flight back from Austin, I’ll need to do a little pre-processing of the data, such as here.
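The pre-processing code is not included in this transcript. A minimal sketch of one way to do it with dplyr follows. The from_austin data frame and the OP_CARRIER_FL_NUM column name are assumptions, and the values are made up; only avg_delay and AvgDelay are names used in the episode.

```r
library(dplyr)

# Made-up stand-in for the Austin-to-Boston flights.
# The data frame name and flight-number column are assumptions.
from_austin <- data.frame(
  OP_CARRIER        = c("B6", "B6", "DL", "DL", "WN", "WN"),
  OP_CARRIER_FL_NUM = c("1040", "1040", "2201", "2201", "300", "300"),
  ARR_DELAY         = c(12, NA, -5, 3, 20, 40),
  stringsAsFactors  = FALSE
)

# One row per flight, with the mean arrival delay.
# na.rm = TRUE skips flights with no recorded arrival delay.
avg_delay <- from_austin %>%
  group_by(OP_CARRIER, OP_CARRIER_FL_NUM) %>%
  summarize(AvgDelay = mean(ARR_DELAY, na.rm = TRUE)) %>%
  ungroup()

avg_delay
```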
Then I’ll choose the avg_delay data frame in the ggplot builder add-in.
Voila, a bar graph.
If I want a graph where the bars are ordered from low to high, I’m on my own to either reorder them manually by adding ggplot code, or create ordered factors in my original data. That’s easy to do with forcats:
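The code shown on screen is not part of this transcript. A minimal sketch of the idea, assuming a small made-up avg_delay data frame and a hypothetical new column called Flight:

```r
library(forcats)

# Made-up stand-in for avg_delay; the flight-number column name
# and the values are assumptions.
avg_delay <- data.frame(
  OP_CARRIER_FL_NUM = c("1040", "2201", "300"),
  AvgDelay          = c(12, -1, 30),
  stringsAsFactors  = FALSE
)

# as_factor() creates a factor whose levels follow the order
# in which the values first appear.
avg_delay$Flight <- as_factor(avg_delay$OP_CARRIER_FL_NUM)

# fct_reorder() re-sorts those levels by AvgDelay, low to high.
avg_delay$Flight <- fct_reorder(avg_delay$Flight, avg_delay$AvgDelay)

levels(avg_delay$Flight)
```

ggplot2 draws bars in factor-level order, so using the reordered column on the X axis produces bars sorted by average delay.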
In those last lines, I used forcats as_factor() to create a new factor, and then reordered it based on the value of AvgDelay. Now …
Voila, an ordered bar graph.
That’s it for this episode, thanks for watching! For more R tips, head to the More With R video page at go.infoworld.com/morewithR. That’s https go dot infoworld dot com slash more with R, all lowercase except for the R. Or, you can add the “Do More With R” playlist to your YouTube library. So long, and hope to see you next episode!