I recently found an interesting R package that animates ggplot2 plots: gganimate. This notebook explores some of its functionality, solely for learning purposes. As a test data set I use gapminder, as in the examples of the gganimate package. In particular, I am interested in whether it would support me in these areas:

  1. Data analysis, i.e., does it help to generate insights more efficiently?
  2. Data visualization, i.e., does it help to better communicate findings?

This notebook is organized as follows: In Setup I list the necessary software and settings and give a data overview. In Non-animated plots I show some plots that I would create to assess the data without any animation, for later comparison. In Animated plots I use gganimate to create animated plots. I conclude in the last section, Conclusion.

Contact: gresch

Setup

Software

  1. Install R
  2. Install RStudio
  3. Install ImageMagick
  4. Install gganimate via devtools::install_github("dgrtwo/gganimate")

Libraries

library(gganimate)       # animation package
library(ggplot2)         # plotting package
library(gapminder)       # package with data to visualize
library(dplyr)           # package for data transformation

Settings

theme_set(theme_bw())    # set theme to black and white

Data overview

The data is an excerpt of the Gapminder data on life expectancy, GDP per capita, and population by country. It includes 1704 observations with six variables:

  • country
  • continent
  • year
  • lifeExp
  • pop
  • gdpPercap

knitr::kable(summary(gapminder))

| country | continent | year | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|
| Afghanistan: 12 | Africa :624 | Min. :1952 | Min. :23.60 | Min. :6.001e+04 | Min. : 241.2 |
| Albania : 12 | Americas:300 | 1st Qu.:1966 | 1st Qu.:48.20 | 1st Qu.:2.794e+06 | 1st Qu.: 1202.1 |
| Algeria : 12 | Asia :396 | Median :1980 | Median :60.71 | Median :7.024e+06 | Median : 3531.8 |
| Angola : 12 | Europe :360 | Mean :1980 | Mean :59.47 | Mean :2.960e+07 | Mean : 7215.3 |
| Argentina : 12 | Oceania : 24 | 3rd Qu.:1993 | 3rd Qu.:70.85 | 3rd Qu.:1.959e+07 | 3rd Qu.: 9325.5 |
| Australia : 12 | NA | Max. :2007 | Max. :82.60 | Max. :1.319e+09 | Max. :113523.1 |
| (Other) :1632 | NA | NA | NA | NA | NA |

knitr::kable(head(gapminder, 15))

| country | continent | year | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|
| Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.4453 |
| Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.8530 |
| Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.1007 |
| Afghanistan | Asia | 1967 | 34.020 | 11537966 | 836.1971 |
| Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.9811 |
| Afghanistan | Asia | 1977 | 38.438 | 14880372 | 786.1134 |
| Afghanistan | Asia | 1982 | 39.854 | 12881816 | 978.0114 |
| Afghanistan | Asia | 1987 | 40.822 | 13867957 | 852.3959 |
| Afghanistan | Asia | 1992 | 41.674 | 16317921 | 649.3414 |
| Afghanistan | Asia | 1997 | 41.763 | 22227415 | 635.3414 |
| Afghanistan | Asia | 2002 | 42.129 | 25268405 | 726.7341 |
| Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.5803 |
| Albania | Europe | 1952 | 55.230 | 1282697 | 1601.0561 |
| Albania | Europe | 1957 | 59.280 | 1476505 | 1942.2842 |
| Albania | Europe | 1962 | 64.820 | 1728137 | 2312.8890 |
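
The dimensions and variable names quoted in this section can be checked directly. This is a minimal sketch that assumes only that the gapminder package is installed; the object names n_obs, n_vars, and var_names are my own and not part of any package.

```r
library(gapminder)

n_obs     <- nrow(gapminder)    # number of observations (rows)
n_vars    <- ncol(gapminder)    # number of variables (columns)
var_names <- names(gapminder)   # column names as stored in the data

n_obs
n_vars
var_names
```

Checking the names this way also guards against typing a column name with the wrong capitalization in later aes() calls.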

Non-animated plots

Before testing gganimate I create some plots with plain ggplot2. For this analysis I concentrate on the relationship between GDP per capita and life expectancy at birth. To increase readability, I put the x-axis on a log10 scale and use color to distinguish observations from different continents. The plot shows a positive relationship between the two variables.

ggplot(gapminder) +
  geom_point(aes(gdpPercap, lifeExp, color = continent)) + 
  scale_x_log10()
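
To put a rough number on the visual impression, one could compute a correlation on the same scale the plot uses. This is only a sketch: pooling all countries and years ignores that each country is measured repeatedly, so the value is a descriptive indicator and not a formal test. The name r_pooled is my own.

```r
library(gapminder)

# Pearson correlation between log10(GDP per capita) and life expectancy,
# pooled over all countries and years (matches the log10 x-axis of the plot)
r_pooled <- with(gapminder, cor(log10(gdpPercap), lifeExp))
r_pooled
```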

What are possible ways to look at the development of these variables over time (here, across years)? Two options are:

  1. Scatter plot: use of facet_grid(~year)
  2. Area: show averages and use facet_grid(~year)

Scatter plot: use of facet_grid(~year)

Overall, it shows a positive development of life expectancy and GDP per capita over the years, with Europe leading and Africa trailing.

ggplot(gapminder) +
  geom_point(aes(gdpPercap, lifeExp, color = continent)) + 
  scale_x_log10() + 
  facet_grid(~year)

Area: show averages and use facet_grid(~year)

This plot also shows a positive development of life expectancy and GDP per capita over the years, but it does not differentiate between continents.

gapminder %>%
  group_by(continent, year) %>% 
  summarise(meanLifeExp = round(mean(lifeExp), 0),
            meanGdpPercap = round(mean(gdpPercap), 0)) %>%
  ggplot() +
  geom_area(aes(meanGdpPercap, meanLifeExp), stat = "identity") +
  scale_x_log10() +
  facet_grid(~year)
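
To see what the area plot actually draws, it can help to look at the aggregated table on its own. The sketch below repeats the aggregation from the plot and stores it under the name continent_means (my own name). With 5 continents and 12 survey years the table has 60 rows, so each facet contains only five points, which geom_area joins into a single shape. That is why the continents cannot be told apart in the plot.

```r
library(gapminder)
library(dplyr)

# Same aggregation as in the area plot: rounded means per continent and year
continent_means <- gapminder %>%
  group_by(continent, year) %>%
  summarise(meanLifeExp = round(mean(lifeExp), 0),
            meanGdpPercap = round(mean(gdpPercap), 0)) %>%
  ungroup()

nrow(continent_means)   # one row per continent-year combination
head(continent_means)
```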

Animated plots

The package gganimate gives me another possibility: to render the initial plot once for every year and to display the results as a sequence. To do so, the variable year needs to be mapped to the frame aesthetic.
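
Conceptually, this amounts to one plot per year shown one after another. The sketch below builds such a list of plots with plain ggplot2 to illustrate the idea; it is not how gganimate is implemented internally, and the names x_limits, y_limits, data_by_year, and frames are my own. The axis limits are fixed so that every frame shares the same coordinate system.

```r
library(gapminder)
library(ggplot2)

# Shared axis limits so that all frames are directly comparable
x_limits <- range(gapminder$gdpPercap)
y_limits <- range(gapminder$lifeExp)

# One data subset per year, then one ggplot object per subset
data_by_year <- split(gapminder, gapminder$year)
frames <- lapply(names(data_by_year), function(yr) {
  ggplot(data_by_year[[yr]]) +
    geom_point(aes(gdpPercap, lifeExp, color = continent)) +
    scale_x_log10(limits = x_limits) +
    ylim(y_limits) +
    ggtitle(yr)
})

length(frames)   # number of frames, one per year in the data
```

Printing the list elements in order, for example frames[[1]], steps through the years by hand; the animation automates this and stitches the frames into one file.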

Animated scatter plot

The animated scatter plot shows the initial plot (i.e., two variables on the axes and a third variable mapped to color). In addition, each frame shows one specific year. Frames change every second, which makes it possible to follow developments over time.

gg_animate(ggplot(gapminder) +
             geom_point(aes(gdpPercap, lifeExp, color = continent, frame = year)) +
             scale_x_log10())

Adding facet_grid(~continent) additionally makes it possible to follow the development of each continent separately.

gg_animate(ggplot(gapminder) +
             geom_point(aes(gdpPercap, lifeExp, color = continent, frame = year)) +
             scale_x_log10() +
             facet_grid(~continent))

Line chart

gg_animate(gapminder %>%
             group_by(continent, year) %>%
             summarise(meanLifeExp = round(mean(lifeExp), 0),
                       meanGdpPercap = round(mean(gdpPercap), 0)) %>%
             ggplot() +
             geom_line(aes(meanGdpPercap, meanLifeExp, frame = year), stat = "identity") +
             scale_x_log10())

Conclusion

gganimate is an interesting package with some advantages, and I will try to use it more often from now on. One open question is whether the width and height of the animated plots can be changed.

1. Data analysis, i.e., does it help to generate insights more efficiently?

In terms of data analysis it brings few benefits. Other plots and means of exploration, such as histograms, table views, or facet grids, are already available to understand the data.

2. Data visualization, i.e., does it help to better communicate findings?

In terms of data visualization I believe gganimate is helpful. The frame adds another dimension to a plot without the need to use color, size, or other means that might clutter the plot. Another advantage is that I can tell a story and show its development over time (or along another variable). Lastly, I think animation makes a visualization more compelling, so people might engage more with the plot.