Fast, easy improvements to bar charts in ggplot2

geom_col() makes a straightforward bar chart in ggplot2, but at default settings, the result may be unusable.

Consider this plot of Catholicism in 19th century Switzerland. Notice (1) the horizontal axis is unreadable, (2) the bars don’t seem to be in any meaningful order, and (3) the whole plot is a boring grayscale.

library(datasets)
library(data.table)
library(ggplot2)

swiss = as.data.table(swiss, keep.rownames = TRUE)

#> str(swiss)
#Classes ‘data.table’ and 'data.frame':	47 obs. of  7 variables:
# $ rn              : chr  "Courtelary" "Delemont" "Franches-Mnt" "Moutier" ...
# $ Fertility       : num  80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 ...
# $ Agriculture     : num  17 45.1 39.7 36.5 43.5 35.3 70.2 67.8 53.3 45.2 ...
# $ Examination     : int  15 6 5 12 17 9 16 14 12 16 ...
# $ Education       : int  12 9 5 7 15 7 7 8 7 13 ...
# $ Catholic        : num  9.96 84.84 93.4 33.77 5.16 ...
# $ Infant.Mortality: num  22.2 22.2 20.2 20.3 20.6 26.6 23.6 24.9 21 24.4 ...

ggplot(swiss, aes(rn, Catholic)) +
  geom_col()

[Figure: geom_col_defaults]

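Let’s solve the first problem by rotating the plot with coord_flip(). We’ll also relabel “rn” to something more descriptive. Note that in ggplot2, once an x axis, always an x axis: coord_flip() changes only how the plot is drawn, so the province names are still addressed as x in labs() and scale_x_discrete(), even though they now appear on the vertical axis.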

ggplot(swiss, aes(rn, Catholic)) +
  geom_col() +
  coord_flip() +
  labs(x = "province")

[Figure: geom_col_flipped_xlab]
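
If your installed ggplot2 is version 3.3.0 or later, an alternative to coord_flip() is to map the province to y directly; geom_col() then draws horizontal bars on its own. This is only a sketch, not the approach used in the rest of this post, and the names swiss_df and p_horizontal are made up here so they won't collide with the swiss object above. Note that the label is now set with y, not x.

```r
library(datasets)
library(ggplot2)

# hypothetical helper data frame: province names pulled out of the row names
swiss_df <- data.frame(rn = rownames(datasets::swiss),
                       Catholic = datasets::swiss$Catholic)

# assumes ggplot2 >= 3.3.0, where geom_col() infers a horizontal orientation
# when the discrete variable is mapped to y
p_horizontal <- ggplot(swiss_df, aes(Catholic, rn)) +
  geom_col() +
  labs(y = "province")
p_horizontal
```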

With the province names legible, we see the bars were in an order after all (reverse alphabetical, reading top to bottom), but you’re more likely to want to sort A-to-Z or by percentage Catholic.

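Either choice requires just one extra line of code with scale_x_discrete(). The limits argument accepts a character vector of province names in the order they should be drawn. After coord_flip(), the first element lands at the bottom of the plot, so the vector runs in the reverse of the top-to-bottom order you want to read.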
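
To see what is actually being handed to limits, it can help to build the vector on its own first. A minimal sketch, using the made-up names swiss_dt and province_order so nothing above is overwritten:

```r
library(datasets)
library(data.table)

# hypothetical copy of the data, built the same way as swiss above
swiss_dt <- as.data.table(datasets::swiss, keep.rownames = TRUE)

# province names sorted from least to most Catholic
province_order <- swiss_dt[order(Catholic), rn]

head(province_order)  # the first few names, i.e. the bars nearest the bottom
```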

## A-to-Z order
## (plot not shown)
ggplot(swiss, aes(rn, Catholic)) +
  geom_col() +
  coord_flip() +
  labs(x = "province") +
  scale_x_discrete(limits = swiss[order(-rn), rn])

## descending order, reading top to bottom: most to least Catholic
## (limits run bottom to top after coord_flip(), hence the ascending sort)
## for ascending order, use order(-Catholic) instead
ggplot(swiss, aes(rn, Catholic)) +
  geom_col() +
  coord_flip() +
  labs(x = "province") +
  scale_x_discrete(limits = swiss[order(Catholic), rn])

[Figure: geom_col_ordered]
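
Another common way to get the same ordering, shown here only as an alternative sketch, is to reorder the categories inside aes() with reorder() from base R's stats package. The names swiss_df and p_reordered are made up for this example. Because the x mapping is now an expression, the default axis title would read "reorder(rn, Catholic)", so labs() is still needed.

```r
library(datasets)
library(ggplot2)

# hypothetical helper data frame: province names pulled out of the row names
swiss_df <- data.frame(rn = rownames(datasets::swiss),
                       Catholic = datasets::swiss$Catholic)

# reorder() turns rn into a factor whose levels are sorted by Catholic,
# lowest first, so the highest bar ends up on top after coord_flip()
p_reordered <- ggplot(swiss_df, aes(reorder(rn, Catholic), Catholic)) +
  geom_col() +
  coord_flip() +
  labs(x = "province")
p_reordered
```

The scale_x_discrete(limits = ...) approach keeps the ordering separate from the aesthetic mapping; reorder() keeps it next to the variable it applies to. Which reads better is a matter of taste.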

To finish, we’ll round out the labels and add color for flair.

ggplot(swiss, aes(rn, Catholic)) +
  geom_col(color = "red", # outline of each bar
           fill = "red", # main color
           alpha = 0.8) + # 80% opacity for the fill
  coord_flip() +
  labs(x = "province", 
       y = "percent of residents", 
       title = "Catholicism in 19th century Switzerland") +
  scale_x_discrete(limits = swiss[order(Catholic), rn]) +
  theme_minimal() # white background

[Figure: geom_col_color]

The datasets package documentation contains a more extensive description of what is visualized here, if you’re curious.
