Four Tricks for Enhancing ggplot2 Visualizations

Categories: R, ggplot2, Data Visualization

Author: Ken Vu

Published: October 1, 2023

Introduction

Throughout graduate school, I developed a fascination with R’s ggplot2 library, which offers more ways to visualize data than a single blog post could summarize. I liked that ggplot2 gives you so many ways to present your data and tailor it to your needs and preferences, so much so that I’d find every excuse to use it whenever I had data visualizations to make, in class and outside of it.

This fascination with ggplot2 meant that I had to do a lot of googling and combing through online resources, StackOverflow posts, and textbooks to get the answers I needed. Even with all the R-related books and other resources available, it can be challenging to figure out which tools exist, which ones to use, and when to use them, given how fast R continues to evolve. While constantly looking up information is essential to any educational journey, it can be frustrating, especially for newcomers to R who are unsure where to start.

Thus, for this blog, I’ll share with you some of the tips and tricks I employ to enhance my data visualizations. For simplicity’s sake, let’s dive into them with a scatterplot of data from one of the most commonly used (or overused) data sets of all time - the mtcars data set.

Tricks for Enhancing ggplot2 Visualizations

To begin, we need to have the ggplot2 package installed so we can load it (which is done in the code chunk below).

If you don’t have it installed, you can get the package by running the command install.packages("ggplot2") in your R console or scripting environment one time. Once that’s done, you can run the command in the code chunk below.

library(ggplot2)

Now that we have ggplot2 loaded, we can start creating and enhancing our own data visualizations using the mtcars data set (which ships with R and is available as soon as you start a session, so there’s no need to load it separately).
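Before plotting, it can help to glance at the two columns used throughout this post. The following is a small sketch using only base R; per the data set’s help page (?mtcars), wt is recorded in units of 1000 lbs and mpg in miles per gallon. The object names here are just placeholders for this sketch.

```r
# Peek at the two mtcars columns used in the scatterplots in this post.
cars_preview <- head(mtcars[, c("wt", "mpg")])
print(cars_preview)

# Ranges of both variables; handy when describing the axes later on.
wt_range <- range(mtcars$wt)
mpg_range <- range(mtcars$mpg)
print(wt_range)
print(mpg_range)
```

Knowing the ranges up front makes it easier to write accurate axis labels and alt text later.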

Here, we create a scatterplot with each car’s weight in thousands of pounds (i.e., wt) on the x-axis and its mileage in miles per gallon (i.e., mpg) on the y-axis. We save this scatterplot to the object plt1 so we can reuse and modify it later.

plt1 <- mtcars |> 
  ggplot(mapping = aes(x = wt, y = mpg)) +
  geom_point()
plt1

With this scatterplot set up, we can now modify it and demonstrate some tricks for enhancing it below.

TRICK 1: Modify the plot labels.

You can add labels to the plot’s title and axes to give the audience more information on what the plot is trying to communicate.

By default, the axes are the only elements labeled, and their labels are taken directly from the names of the columns being passed in.

Using the labs() function below, we can add onto plt1 labels for the x and y-axes.

plt1 <- plt1 +
  
  # adding labels to axes
  labs(x = "Weight (1000 lbs)",
       y = "Miles Per Gallon (mpg)")
plt1

We can also add a title and (optionally) a subtitle and caption for the plot. The title indicates the subject matter of the plot, and the subtitle can provide more context.

As for the caption, you can use it to add a footnote to the plot, which typically has been used to indicate the source of the data set. The caption appears on the bottom right portion of the plot, but you can certainly alter where it’s positioned (as you’ll learn later on in this section).

plt1 <- plt1 +
  
  labs(
       # adding title to plot
       title = "Car Weight vs Car Mileage",
       
       # optional
       subtitle = "How much does a car's weight affect its mileage?",
       caption = "source: Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411."
       )
plt1
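A long caption like the one above can run past the edge of the plot. One way to handle that, sketched here with base R’s strwrap(), is to break the caption into several lines before passing it to labs(). The names long_caption, wrapped_caption, and caption_demo are made up for this example, the caption text is adapted from the citation above, and the width of 60 characters is an arbitrary choice.

```r
library(ggplot2)

# A long caption, adapted from the citation used above.
long_caption <- "source: Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391-411."

# strwrap() breaks the text into lines that fit within `width` characters;
# paste() joins those lines back together with line breaks.
wrapped_caption <- paste(strwrap(long_caption, width = 60), collapse = "\n")

# caption_demo is a hypothetical name used only for this sketch.
caption_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(caption = wrapped_caption)
caption_demo
```

You may need to adjust the width depending on the size of your plot and the font size of the caption.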

Furthermore, you can set alt-text using the parameter alt in labs() to provide alternative descriptions for those using screen readers to view your scatterplots.

plt1 <- plt1 + 
  
  # modifying plot labels
  labs(
       # alt text
       alt = "A scatterplot showing a car's weight on the x-axis (with a range from 1 to 6 thousand pounds) and a car's mileage on the y-axis (with a range from 10 to 35 miles per gallon) for 32 automobiles, with the data obtained from the 1974 Motor Trend magazine. The plot shows a negative linear association between a car's weight and a car's mileage: as the car's weight increases, its mileage generally decreases in a roughly linear fashion."
       )
plt1
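Since alt text doesn’t show up on the plot itself, it’s worth checking that it was actually stored. A quick way to do that is ggplot2’s get_alt_text() function (I believe it arrived in ggplot2 3.3.4, so treat the exact version as an assumption). This sketch uses a shortened description and the made-up name alt_demo purely for illustration.

```r
library(ggplot2)

# alt_demo is a hypothetical name used only for this sketch.
alt_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(alt = "Scatterplot of car weight against mileage for 32 cars.")

# get_alt_text() returns the alt text attached to the plot as a string.
stored_alt <- get_alt_text(alt_demo)
print(stored_alt)
```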

You can also adjust the alignment of the plot’s title as well as its font size and font face (i.e., bold, italic, etc).

In fact, you can add the function theme() as a layer on top of your ggplot graph to make further modifications to the visual aesthetics of your plot.

Here, I’m going to center the plot’s title and subtitle, with the title being larger and in boldface. I’ll also make the caption smaller. We can achieve these text adjustments by passing the function element_text() to plot.title, plot.subtitle, and plot.caption in the theme() function respectively.

element_text() is a function in ggplot2, used primarily in the theme() layer, that allows you to control the text aesthetics of any text-related element on your graph.

plt1 <- plt1 + 
  
  # changing text of plot title and plot subtitle
  theme(plot.title = element_text(hjust = 0.5, size = 24,
                                  face = "bold"),
        plot.subtitle = element_text(hjust = 0.5),
        
        # changing size of caption
        plot.caption = element_text(size = 7)
        )
plt1

You can see in the plot above how much neater it looks with a resized title and properly labeled axes.
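The same element_text() approach works for other text elements too. As a rough sketch (text_demo is a made-up name, and the specific sizes are arbitrary choices), here is how you might move the caption to the left with hjust and restyle the axis titles and tick labels:

```r
library(ggplot2)

# text_demo is a hypothetical name used only for this sketch.
text_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Car Weight vs Car Mileage",
       caption = "source: 1974 Motor Trend magazine") +
  theme(
        # hjust = 0 left-aligns the caption; hjust = 1 right-aligns it
        plot.caption = element_text(hjust = 0, size = 7),

        # italic axis titles and slightly larger tick labels
        axis.title = element_text(face = "italic"),
        axis.text = element_text(size = 10)
        )
text_demo
```

Values of hjust between 0 and 1 place the text proportionally between the left and right edges, which is why 0.5 centers the title.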

TRICK 2: Remove the major and minor gridlines

Personally, I don’t mind major and minor gridlines on the plot as they can help the audience better gauge where points are on the plot.

Nevertheless, the option to remove them is there. In the theme() function, if you want to remove any particular element in the plot, you can set that element equal to the function element_blank().

Here, we remove all major gridlines and minor gridlines by respectively setting panel.grid.major = element_blank() and panel.grid.minor = element_blank().

plt1 <- plt1 + 
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())
plt1

Using different arguments, you can also control which specific major/minor gridlines you want to remove (i.e., the ones along the x-axis and/or on the y-axis).

To showcase all of them, I’ll recreate the same result as the previous plot, but by manually removing each gridline (major and minor) along both the x and y axes. Depending on your plotting needs, you can choose to use one, some, or all four of the arguments (see code below).

As before, if you want to remove any particular element on your plot (especially an element not turned off by default such as axes tick marks, for example), identify the parameter in theme() you want to modify (e.g., panel.grid.major, axis.line.x, etc) and set it equal to element_blank().

plt1 + 
  theme(
        # removing major and minor gridlines for x-axis
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        
        # removing major and minor gridlines for y-axis
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank()
        )
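If you only want to drop some of the gridlines, you can use a subset of those four arguments. Here is a minimal sketch (grid_demo is a made-up name) that removes the vertical gridlines while keeping the horizontal ones, which can make it easier to read mileage values off the y-axis:

```r
library(ggplot2)

# grid_demo is a hypothetical name used only for this sketch.
grid_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme(
        # drop the vertical gridlines (the ones tied to the x-axis breaks)
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank()
        # panel.grid.major.y and panel.grid.minor.y are left untouched,
        # so the horizontal gridlines stay visible
        )
grid_demo
```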

TRICK 3: Add in borders around the plotting area

Adding borders alongside the plotting area can help separate the axes of the plot from the area where the data points are plotted for ease of readability.

To add a border around the entire plotting area using theme(), we can pass an element_rect() object to the parameter panel.border; for this object (as with element_text(), element_line(), and other element-related objects), you can modify its color and line width.

Here, for the element_rect() function passed into the parameter panel.border, we have color = "black", fill = NA (to keep the element_rect() from filling in the rectangle and covering all the data points), and linewidth = 0.75. Note that as of ggplot2 3.4.0, the size argument of element_rect() is deprecated in favor of linewidth, so code that still uses size will trigger a warning.

plt1 <- plt1 +
  theme(panel.border = element_rect(color = "black",
                                    fill = NA,
                                    linewidth = 0.75))
plt1
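If your code has to run on ggplot2 versions both older and newer than 3.4.0 (the release in which size was deprecated in favor of linewidth for element_rect()), one possible workaround is to pick the argument name based on the installed version. This is only a sketch, and border_args and border_demo are made-up names:

```r
library(ggplot2)

# Arguments shared by every ggplot2 version.
border_args <- list(color = "black", fill = NA)

# Use `linewidth` on ggplot2 3.4.0 or later, and `size` on older versions.
if (packageVersion("ggplot2") >= "3.4.0") {
  border_args$linewidth <- 0.75
} else {
  border_args$size <- 0.75
}

# do.call() passes the list entries to element_rect() as named arguments.
border_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme(panel.border = do.call(element_rect, border_args))
border_demo
```

For most people, simply updating ggplot2 and using linewidth is the easier route.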

TRICK 4: Change the panel background color.

As you can see below, the background of the plotting area or panel is grey, which can be changed.

plt1

In the theme() function, you can change the background color of the plotting area by passing the element_rect() function to the parameter panel.background, which ONLY controls the appearance of the plotting area and NOT the area outside the plotting area itself.

Passing the argument fill = "white" to element_rect() makes the plotting area (i.e., the “panel”) white (see code and results below).

plt1 +
  theme(panel.background = element_rect(fill = "white"))
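To see the difference between the plotting area and the area outside of it, you can set both at once. In this sketch (background_demo is a made-up name, and the colors are arbitrary), panel.background controls the rectangle behind the points while plot.background controls the area behind the axes, labels, and title:

```r
library(ggplot2)

# background_demo is a hypothetical name used only for this sketch.
background_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme(
        # the panel: the rectangle where the points are drawn
        panel.background = element_rect(fill = "white"),

        # the whole plot: the area behind the axes, labels, and title
        # (color = NA leaves out the outline around the plot)
        plot.background = element_rect(fill = "grey90", color = NA)
        )
background_demo
```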

In general, if you want to make the background (or any element of the graph) a different color, replace "white" in the argument fill = "white" (or whichever argument controls color for that element, such as color) with any named R color (e.g., "black", "blue", "green", etc) or the hex code of a custom color (e.g., something like "#9e8d47").
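As a small sketch of using a hex code, the example below checks the code with col2rgb() (from the grDevices package that loads with R) before using it as the panel fill. The names panel_fill, fill_rgb, and hex_demo are made up for this example.

```r
library(ggplot2)

# The example hex code mentioned above.
panel_fill <- "#9e8d47"

# col2rgb() converts a color name or hex code into its red, green, and
# blue components, and throws an error if the color isn't valid.
fill_rgb <- col2rgb(panel_fill)
print(fill_rgb)

# hex_demo is a hypothetical name used only for this sketch.
hex_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme(panel.background = element_rect(fill = panel_fill))
hex_demo
```

Checking the color first is optional, but it catches typos in a hex code before they turn into a confusing plotting error.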

You can use this color picker here to find the hex codes of custom colors you want to incorporate: https://htmlcolorcodes.com/color-picker/

Conclusion

Overall, these are some options I’ve used in the past that you can use to great effect to modify or enhance your ggplot2 data visualizations. There are certainly a lot more beyond what I can cover in this blog post, but I do hope they prove useful to you moving forward.

Should you ever want to learn more about ggplot2’s wide array of customization features and functions, I recommend checking out some of the resources I listed in the Resources section below.

Otherwise, thank you for reading this blog and stay tuned for more from The R Files blog.

Resources

I recommend the following resources for enhancing your exploration and experimentation with the ggplot2 library.

  • R for Data Science: This book (now in its second edition) is a classic and covers some of the bare essentials needed to work with and display all kinds of data as well as strategies for writing clean code. Chapters 10-12 are relevant for those focused on data visualization in general, and Chapters 2-9 cover generally good practices for writing and maintaining clean code. You can read it online here: https://r4ds.hadley.nz/

  • ggplot2: Elegant Graphics for Data Analysis: This is a good book that explains the grammar of graphics of ggplot2 and how it works under the surface. Chapters 3-5, 8, 9, 10-14, and 17 are some chapters I recommend for understanding more of the basics of ggplot2’s plotting functions, the steps for making ggplot2 graphs in general, and some of the ways in which the plot aesthetics are made and how they can be modified. You can read the work-in-progress version online here: https://ggplot2-book.org/

  • Data Visualization with R by Rob: It’s a direct guide on building plots with ggplot2 along with best practices for data visualizations in general. Chapters 3-6, 11, and 14 directly focus on ggplot2, its wide array of customization for the aesthetics of its plots, and best practices for creating effective and visually sound graphs. Chapter 10 is also interesting if you’d like to explore other graphs besides the conventional 2D bar plots and scatterplots you see often in ggplot2, such as dumbbell plots and heat maps. Currently, it’s only available as an online bookdown, but a print version is reportedly in the works and should be available on Amazon soon.