To make a bar graph from the sample BOD data frame included with R, the basic R function is barplot(). To plot the demand column from the BOD data set on a bar graph, pass that column to the function as barplot(BOD$demand). Add main="Graph of demand" if you want a main headline on your graph:
barplot(BOD$demand, main="Graph of demand")
To label the bars on the x axis, use the names.arg argument and set it to the column you want to use for labels:
barplot(BOD$demand, main="Graph of demand", names.arg = BOD$Time)
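As a slightly fuller sketch of the same call, the block below also captures the value that barplot() returns (the x positions of the bar centers) and uses it to print each value above its bar. The axis titles and the name bar_mids are additions for illustration, not part of the original example.

```r
# BOD is a small data frame that ships with base R (columns: Time, demand).
# barplot() draws the chart and returns the x positions of the bar centers.
bar_mids <- barplot(
  BOD$demand,
  main      = "Graph of demand",
  names.arg = BOD$Time,   # one label per bar
  xlab      = "Time",     # axis titles are illustrative additions
  ylab      = "Demand"
)

# Use the returned midpoints to print each value just above its bar.
# xpd = TRUE lets the label for the tallest bar extend past the plot region.
text(x = bar_mids, y = BOD$demand, labels = BOD$demand, pos = 3, xpd = TRUE)
```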
Sometimes you'd like to graph the counts of a particular variable but you have only raw data, not a table of frequencies. R's table() function is a quick way to generate counts for each distinct value in your data.
The R Graphics Cookbook uses an example of a bar graph for the number of 4-, 6- and 8-cylinder vehicles in the mtcars data set. Cylinders are listed in the cyl column, which you can access in R using mtcars$cyl.
Here's code to get the count of how many entries there are by cylinder with the table() function; it stores results in a variable called cylcount:
cylcount <- table(mtcars$cyl)
That creates a table called cylcount containing:
 4  6  8
11  7 14
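If you want to work with those counts outside of a plot, a minimal sketch like the following pulls the labels and counts apart and converts the table to a data frame. The names cyl_df, cyl and count are chosen here for illustration only.

```r
cylcount <- table(mtcars$cyl)

names(cylcount)        # the distinct cyl values, as character strings
as.vector(cylcount)    # the counts alone, without the labels

# as.data.frame() gives one row per value, which is handy for other
# plotting functions that expect a data frame.
cyl_df <- as.data.frame(cylcount, stringsAsFactors = FALSE)
names(cyl_df) <- c("cyl", "count")   # illustrative column names
cyl_df
```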
Now you can create a bar graph of the cylinder count with barplot(cylcount).
The qplot() quick plotting function in the ggplot2 package can also create bar graphs, for example with qplot(mtcars$cyl).
(Figure: Creating a bar plot.)
However, this defaults to an assumption that 4, 6, and 8 are part of a variable set that could run from 4 through 8, so it shows blank entries for 5 and 7.
To treat cylinders as distinct groups -- that is, you have a group with 4 cylinders, a group with 6, and a group with 8, not the possibility of entries anywhere between 4 and 8 -- you want cylinders to be treated as a statistical factor. You do that by wrapping the column in factor(), as in qplot(factor(mtcars$cyl)).
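To see what the conversion changes, here is a small base R sketch comparing the numeric column with its factor version. The variable names cyl_num and cyl_fac are illustrative.

```r
cyl_num <- mtcars$cyl          # stored as plain numbers
cyl_fac <- factor(mtcars$cyl)  # same values, now treated as categories

is.numeric(cyl_num)  # TRUE
levels(cyl_fac)      # "4" "6" "8"

# Counting a factor reports only the groups that exist: no 5 or 7.
table(cyl_fac)
```

Because the factor has exactly three levels, a plotting function has no reason to leave room for values between them.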
To create a bar graph with the more robust ggplot() function, you can use syntax such as:
ggplot(mtcars, aes(factor(cyl))) + geom_bar()
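A slightly expanded version of that call might look like the sketch below, assuming the ggplot2 package is installed. The axis labels, the title and the object name p are additions for illustration.

```r
library(ggplot2)  # assumes the ggplot2 package is installed

p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +    # counts the rows in each cylinder group
  labs(x = "Cylinders", y = "Number of cars",
       title = "Cars by cylinder count")

print(p)          # printing the object draws the chart
```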
Histograms work pretty much the same, except you want to specify how many buckets or bins you want your data to be separated into. For base R graphics, use:
hist(mydata$columnName, breaks = n)
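In this example, columnName is the name of the column in the mydata data frame that you want to visualize, and n is the number of bins you want. R treats a single number here as a suggestion and may adjust it so the break points fall on round values.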
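As a rough illustration using the mpg column of mtcars (chosen here only as sample data), the sketch below asks for a number of bins, inspects the break points R picked, and then supplies explicit break points instead. The names h and h_exact are illustrative.

```r
# plot = FALSE returns the histogram object without drawing it.
h <- hist(mtcars$mpg, breaks = 5, plot = FALSE)
h$breaks   # the break points R actually chose
h$counts   # how many values fell into each bin

# Passing a vector of break points uses them as given.
h_exact <- hist(mtcars$mpg, breaks = seq(10, 35, by = 5), plot = FALSE)
h_exact$breaks
```

Drop plot = FALSE to draw either histogram.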
(Figure: What happens to your bar chart when you don't tell R to treat a continuous variable as a factor.)
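The ggplot2 commands are:
qplot(columnName, data=mydata, binwidth=n)
for quick plots and, for the more robust ggplot() function:
ggplot(mydata, aes(x=columnName)) + geom_histogram(binwidth=n)
In these ggplot2 calls, binwidth sets the width of each bin in data units rather than the number of bins; geom_histogram() also accepts a bins argument if you would rather specify a count.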
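To make the binwidth argument concrete, the following sketch draws the mpg column of mtcars (used here only as sample data) twice, assuming ggplot2 is installed: once with a fixed bin width and once with a fixed number of bins through the bins argument of geom_histogram(). The object names are illustrative.

```r
library(ggplot2)  # assumes the ggplot2 package is installed

# Each bin spans 5 mpg; the number of bins follows from the data range.
p_width <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 5)

# Ask for 5 bins; ggplot2 works out the width.
p_count <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 5)

print(p_width)
print(p_count)
```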
You may be starting to see strong similarities in syntax for various ggplot() examples. While the ggplot() function is somewhat less intuitive, once you wrap your head around its general principles, you can do other types of graphics in a similar way.