# Beginner's guide to R: Painless data visualization

## Learn how to paint a picture with data with R, using just a couple lines of code


### Bar graphs

The basic R function for making a bar graph is `barplot()`. To plot the demand column from the sample BOD data frame included with R, use the command:

`barplot(BOD$demand)`

Add `main="Graph of demand"` if you want a main headline on your graph:

`barplot(BOD$demand, main="Graph of demand")`

To label the bars on the x axis, use the `names.arg` argument and set it to the column you want to use for labels:

`barplot(BOD$demand, main="Graph of demand", names.arg = BOD$Time)`
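Putting those arguments together, here is a minimal sketch of the full call. The `xlab` and `ylab` axis labels are an addition for illustration and are not part of the commands above; `bar_mids` is just a hypothetical variable name for the value `barplot()` returns.

```r
# BOD is a sample data frame that ships with base R (columns: Time, demand)
bar_mids <- barplot(
  BOD$demand,
  main      = "Graph of demand",  # headline above the graph
  names.arg = BOD$Time,           # label under each bar
  xlab      = "Time",             # axis labels: added here for illustration
  ylab      = "demand"
)

# barplot() returns the x position of each bar's midpoint,
# which is handy if you later want to add text above the bars
bar_mids
```

Storing the return value is optional; calling `barplot()` on its own draws the same graph.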

Sometimes you'd like to graph the counts of a particular variable but you have only raw data, not a table of frequencies. R's `table()` function is a quick way to count how many times each distinct value appears in your data.

The R Graphics Cookbook uses an example of a bar graph for the number of 4-, 6- and 8-cylinder vehicles in the mtcars data set. Cylinders are listed in the `cyl` column, which you can access in R using `mtcars$cyl`.

This code uses the `table()` function to count the entries for each cylinder value and stores the result in a variable called `cylcount`:

`cylcount <- table(mtcars$cyl)`

That creates a table called `cylcount` containing:

| cyl   | 4  | 6 | 8  |
|-------|----|---|----|
| count | 11 | 7 | 14 |

Now you can create a bar graph of the cylinder count:

`barplot(cylcount)`
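As a runnable sketch of the whole sequence, the following counts the cylinder values and then graphs them. The title and axis labels are assumptions added for readability.

```r
# Count how many cars fall under each distinct cylinder value
cylcount <- table(mtcars$cyl)
print(cylcount)

# A one-dimensional table carries its own names (4, 6, 8),
# so barplot() labels the bars without needing names.arg
barplot(
  cylcount,
  main = "Cars by number of cylinders",  # illustrative title
  xlab = "Cylinders",
  ylab = "Number of cars"
)
```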

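The `qplot()` quick plotting function from the ggplot2 package can also create bar graphs (recent ggplot2 releases mark `qplot()` as deprecated, though it is still available):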

`qplot(mtcars$cyl)`

However, `qplot()` treats `cyl` as a continuous numeric variable whose values could fall anywhere from 4 through 8, so the graph leaves empty space at 5 and 7.

To treat cylinders as distinct groups (a group with 4 cylinders, a group with 6, and a group with 8, with no possible entries in between), convert the column to a statistical factor:

`qplot(factor(mtcars$cyl))`
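To see what the conversion changes, this short base R sketch compares the raw column with its factor version. It needs no plotting package; `cyl_numeric` and `cyl_factor` are hypothetical variable names used only for this illustration.

```r
cyl_numeric <- mtcars$cyl          # stored as plain numbers
cyl_factor  <- factor(mtcars$cyl)  # stored as a set of distinct groups

# A numeric column lives on a continuous scale, so a plot may leave
# room for values such as 5 and 7 even though no car has them
is.numeric(cyl_numeric)
range(cyl_numeric)

# A factor knows only the values that actually occur
levels(cyl_factor)   # "4" "6" "8"
nlevels(cyl_factor)
```

Because the factor has exactly three levels, a plot built from it draws exactly three bars.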

To create a bar graph with the more robust `ggplot()` function, you can use syntax such as:

`ggplot(mtcars, aes(factor(cyl))) + geom_bar()`
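A slightly fuller sketch of the same graph follows, assuming the ggplot2 package is installed. The `labs()` call and its label text are additions for illustration, and `p` is a hypothetical variable name.

```r
library(ggplot2)  # assumes ggplot2 is installed

# geom_bar() counts the rows in each group itself,
# so there is no need to call table() first
p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(
    title = "Cars by number of cylinders",  # illustrative labels
    x     = "Cylinders",
    y     = "Number of cars"
  )

print(p)  # printing the plot object draws it
```

Saving the plot in a variable is optional, but it makes it easy to add more layers later.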

### Histograms

Histograms work pretty much the same, except you want to specify how many buckets or bins you want your data to be separated into. For base R graphics, use:

`hist(mydata$columnName, breaks = n)`

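In this example, `columnName` is the name of the column in a `mydata` data frame that you want to visualize, and `n` is the number of bins you want. R treats a single number for `breaks` as a suggestion and may round to convenient bin edges, so the result can have slightly more or fewer bins than `n`.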
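As a concrete sketch, the following uses the `mpg` column of mtcars in place of the generic `mydata$columnName`. It shows how to check which bins `hist()` actually chose and how to pass explicit bin edges when you need exact control. The variable names `h`, `edges` and `h_exact` and the edge values are assumptions for this illustration.

```r
# plot = FALSE returns the histogram object without drawing it
h <- hist(mtcars$mpg, breaks = 5, plot = FALSE)
h$breaks          # the bin edges R settled on
length(h$counts)  # the number of bins that resulted

# For exact control, pass a vector of bin edges instead of a single number.
# The edges must span the whole range of the data (mpg runs from 10.4 to 33.9)
edges   <- seq(10, 35, by = 5)
h_exact <- hist(
  mtcars$mpg,
  breaks = edges,
  main   = "Miles per gallon",
  xlab   = "mpg"
)
```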

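The ggplot2 command for quick plots is:

`qplot(columnName, data=mydata, binwidth=w)`

And for the more robust `ggplot()`:

`ggplot(mydata, aes(x=columnName)) + geom_histogram(binwidth=w)`

Note that `binwidth` sets the width of each bin in the units of your data, not the number of bins, so `w` here is a width. To ask for a specific number of bins instead, use `geom_histogram(bins = n)`.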
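To make the `binwidth` argument concrete, here is a sketch using the `mpg` column of mtcars, assuming ggplot2 is installed. It contrasts `binwidth`, which is the width of each bin in data units, with the `bins` argument of `geom_histogram()`, which is the number of bins. The values 5 and 10 and the names `p_width` and `p_bins` are arbitrary choices for illustration.

```r
library(ggplot2)  # assumes ggplot2 is installed

# binwidth: every bin covers 5 miles per gallon
p_width <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 5)

# bins: ask for 10 bins and let ggplot2 work out the width
p_bins <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

print(p_width)
print(p_bins)
```

Either way, every row lands in some bin; only the grouping changes.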

You may be starting to see strong similarities in syntax for various `ggplot()` examples. While the `ggplot()` function is somewhat less intuitive, once you wrap your head around its general principles, you can do other types of graphics in a similar way.
