pacman::p_load(ggrepel, patchwork, ggthemes, hrbrthemes, tidyverse)Hands-on Exercise 2
1 Overview
This exercise will be introduced to several ggplot2 extension for creating statistical graphics.
2 Getting Start
2.1 Install and Load R Libraries
Beside tidyverse which is introduced in hands on exercies 1, four more R packages will be used. They are:
ggrepel: an R package provides geoms for ggplot2 to repel overlapping text labels.
ggthemes: an R package provides some extra themes, geoms, and scales for ‘ggplot2’.
hrbrthemes: an R package provides typography-centric themes and theme components for ggplot2.
patchwork: an R package for preparing composite figure created using ggplot2.
Code Chunk below is used to check the packages are installed and also load them into the current R environment.
After checking there is no issues with installing and uploading the packages, we proceed to import the data used for this exercise.
2.2 Importing Data
For this exercise, Exam_data.csv is introduced. It consist of year end examination grades of a cohort of primary 3 students from a local school.
Code chunk below imports the data mentioned above into current R environment using read_csv() function of readr package. readr is one of the tidyverse package.
exam_data <- read_csv("C:/lzc0313/ISSS608-VAA/Hands-on_Ex/Hands-on_Ex01/data/Exam_data.csv")There are 7 attributes and 322 columns in the exam_data. Four of the attributes are categorical data type and others are in continuous data type.
The categorical attributes are: ID, CLASS, GENDER and RACE.
The continuous attributes are: MATHS, ENGLISH and SCIENCE.
2.3 Beyond ggplot2 Annotation: ggrepel
One of the challenge in plotting statistical graph is annotation, especially with large number of data points.
# Your ggplot code
ggplot(data=exam_data,
aes(x= MATHS,
y=ENGLISH)) +
geom_point() +
geom_smooth(method=lm,
size=0.5) +
geom_label(aes(label = ID),
hjust = .5,
vjust = -.5) +
coord_cartesian(xlim=c(0,100),
ylim=c(0,100)) +
ggtitle("English scores versus Maths scores for Primary 3")
ggrepel is an extension of ggplot2 which provides geoms for ggplot2 to repel overlapping text as in the above graph.
We simply replace geom_text() by geom_text_repel() and geom_label() by geom_label_repel.
2.3.1 Working with ggrepel
ggplot(data=exam_data,
aes(x= MATHS,
y=ENGLISH)) +
geom_point() +
geom_smooth(method=lm,
size=0.5) +
geom_label_repel(aes(label = ID),
fontface = "bold") +
coord_cartesian(xlim=c(0,100),
ylim=c(0,100)) +
ggtitle("English scores versus Maths scores for Primary 3")
2.4 Beyond ggplot2 Themes
There are 8 build-in themes in ggplot2, they are:theme_gray(), theme_bw(), theme_classic(), theme_dark(), theme_light(), theme_linedraw(), theme_minimal(), and theme_void().
ggplot(data=exam_data,
aes(x = MATHS)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
theme_void() +
ggtitle("Distribution of Maths scores") 
More about ggplot2 themes (link).
2.4.1 Working with ggtheme package
ggthemes provides ‘ggplot2’ themes that replicate the look of plots by Edward Tufte, Stephen Few, Fivethirtyeight, The Economist, ‘Stata’, ‘Excel’, and The Wall Street Journal, among others.
Below is the example of a economist theme.
ggplot(data=exam_data,
aes(x = MATHS)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
ggtitle("Distribution of Maths scores") +
theme_economist()
2.4.2 hrbthemes package
hrbrthemes package provides a base theme that focuses on typographic elements, including where various labels are placed as well as the fonts that are used.
ggplot(data=exam_data,
aes(x = MATHS)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
ggtitle("Distribution of Maths scores") +
theme_ipsum()
The second goal of hrbrthemes centers around the productivity for a production workflow. learn more
ggplot(data=exam_data,
aes(x = MATHS)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
ggtitle("Distribution of Maths scores") +
theme_ipsum(axis_title_size = 18,
base_size = 15,
grid = "Y")
What can we learn from the above code chunk?
axis_title_size argument is used to increase the font size of the axis title to 18,
base_size argument is used to increase the default axis label to 15, and
grid argument is used to remove the x-axis grid lines.
2.5 Beyond Single Graph
It is not unusual that multiple graphs are required to tell a compelling visual story. There are several ggplot2 extensions provide functions to compose figure with multiple graphs. In this section, you will learn how to create composite plot by combining multiple graphs. First, let us create three statistical graphics by using the code chunk below.
p1 <- ggplot(data=exam_data,
aes(x = MATHS)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
coord_cartesian(xlim=c(0,100)) +
ggtitle("Distribution of Maths scores")Next,
p2 <- ggplot(data=exam_data,
aes(x = ENGLISH)) +
geom_histogram(bins=20,
boundary = 100,
color="grey25",
fill="grey90") +
coord_cartesian(xlim=c(0,100)) +
ggtitle("Distribution of English scores")Lastly, a scatterplot.
p3 <- ggplot(data=exam_data,
aes(x= MATHS,
y=ENGLISH)) +
geom_point() +
geom_smooth(method=lm,
size=0.5) +
coord_cartesian(xlim=c(0,100),
ylim=c(0,100)) +
ggtitle("English scores versus Maths scores for Primary 3")2.5.1 Creating Composite Graphics: pathwork methods
Pathwork from the ggplot2 extension is specially designed for combining separate ggplot2 graphs into a single figure.
Patchwork package has a very simple syntax where we can create layouts super easily. Here’s the general syntax that combines:
Two-Column Layout using the Plus Sign +.
Parenthesis () to create a subplot group.
Two-Row Layout using the Division Sign
/
2.5.2 Combining two ggplot2 graphs
below codes combines two histogram.
p1 + p2
2.5.3 Combining three ggplot2 graphs
We can plot more complex composite by using appropriate operators. For example, the composite figure below is plotted by using:
“|” operator to stack two ggplot2 graphs,
“/” operator to place the plots beside each other,
“()” operator the define the sequence of the plotting.
(p1 / p2) | p3
More about this topics, refer to Plot Assembly.
2.5.4 Creating a composite figure with tag
To identify the subplots in text, patchwork also provides auto-tagging capabilities shown below.
((p1 / p2) | p3) +
plot_annotation(tag_levels = 'I')
2.5.5 Creating figure with insert
Other than putting the graphs side by side, we can put graph on top or below one another using inset_element() of patchwork.
p3 + inset_element(p2,
left = 0.02,
bottom = 0.7,
right = 0.5,
top = 1)
2.5.6 Creating composite figure using patchwork and ggtheme
Figure below is created by combining patchwork and theme_economist() of ggthemes package discussed earlier.
patchwork <- (p1 / p2) | p3
patchwork & theme_economist()