class: center, middle, inverse, title-slide

# Introduction to ggplot2
## Data Visualization Week 2020
### Ijeamaka Anyene

---

## What is ggplot2?

.pull-left[
<br><br>An R package that is part of the tidyverse
<br><br>Specifically designed for the creation of plots
<br><br>Easy to iterate and build on
]

.pull-right[
<img src="supplemental_graphics/ggplot2.png" width="50%" style="display: block; margin: auto;" />
]

---

## Quick FYI!

`ggplot2` is a huge package with a ton of functionality to create plots; this presentation will only skim the surface.
<br><br>
This will be more of a "watch and learn" type of presentation than a "follow along" one.
<br>But all the code to generate plots will be visible and thus available to you!

---

## Data: {palmerpenguins} R package

Contains size measurements for three penguin species observed on three islands in the Palmer Archipelago, Antarctica.

|species | body_mass_g| bill_length_mm|sex    |
|:-------|-----------:|--------------:|:------|
|Adelie  |        3750|           39.1|male   |
|Adelie  |        3800|           39.5|female |
|Adelie  |        3250|           40.3|female |
|Adelie  |        3450|           36.7|female |
|Adelie  |        3650|           39.3|male   |
|Adelie  |        3625|           38.9|female |

---
class: center, middle

## So... what's the gg in ggplot2?

<br><br>
# ggplot2 is the **g**rammar of **g**raphics **plot** ... 2

---

## Grammar of Graphics

> "Grammar of graphics is a tool that enables us to concisely describe the components of a graphic"

<br><br>
**The Grammar of Graphics by Leland Wilkinson**

* Proposed an idea of building up a graphic from multiple "layers of data"

<br><br>
**A Layered Grammar of Graphics by Hadley Wickham**

* "Layers" implemented in ggplot2 (with some tweaks and expansion on the concept)
* Thinking behind the implementation of the concept in `ggplot2`

---

## Grammar of Graphics

<img src="supplemental_graphics/layered_graphics.png" width="80%" style="display: block; margin: auto;" />

---

## Grammar of Graphics - Essentials

.pull-left[
**Data**

```r
ggplot( )
```
]

.pull-right[
<br>
The source of information for your visualization.
<br><br>`ggplot()` requires your data to be 'tidy'

* Every variable has a column
* Every observation has a row
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Data**

```r
ggplot(data = penguins)

penguins %>%
  ggplot()
```
]

.pull-right[
<br>
The source of information for your visualization.
<br><br>`ggplot()` requires your data to be 'tidy'

* Every variable has a column
* Every observation has a row
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins)
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Mapping**

```r
aes(x = , y = )
```
]

.pull-right[
<br>
Crosswalk between variables in the dataset and how they are visualized in the plot

* body mass g -> x
* bill length mm -> y
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Mapping**

```r
aes(x = body_mass_g,
    y = bill_length_mm)
```
]

.pull-right[
<br>
Crosswalk between variables in the dataset and how they are visualized in the plot

* body mass g -> x
* bill length mm -> y
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Geometric Object**

```r
geom_*()
```
]

.pull-right[
<br>
Controls the type of plot you create

* geom_point() = scatter plot
* geom_line() = line plot
* geom_col() or geom_bar() = bar graph

...and more!!!
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Geometric Object**

```r
geom_point()
```
]

.pull-right[
<br>
Controls the type of plot you create

* geom_point() = scatter plot
* geom_line() = line plot
* geom_col() or geom_bar() = bar graph

...and more!!!
]

---

## Grammar of Graphics - Essentials

.pull-left[
**Mapping + Geometric Object**

```r
# Method 1
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm)) +
  geom_point()

# Method 2
ggplot(data = penguins) +
  geom_point(aes(x = body_mass_g,
                 y = bill_length_mm))

# Method 3
ggplot() +
  geom_point(data = penguins,
             aes(x = body_mass_g,
                 y = bill_length_mm))
```
]

.pull-right[
<br>
Method 1: Best when using one data set and one aesthetic mapping
<br><br>
Method 2: Best when using one data set, and multiple geoms + aesthetic mappings
<br><br>
Method 3: Best when using multiple data sets, and multiple geoms + aesthetic mappings
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
*      aes(x = body_mass_g,
*          y = bill_length_mm))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-2-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm)) +
* geom_point()
```

<br>Understands the following aesthetics

* x
* y
* alpha
* colour
* fill
* shape
* size
* etc.
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-3-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
*          colour = sex)) +
  geom_point()
```

<br><br>Add another dimension to the plot
<br>by using `colour = `
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-4-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm)) +
* geom_point(aes(shape = as.factor(sex)))
```

<br><br>Change the shape of the points by mapping `shape` inside `aes()`
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-6-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm)) +
* geom_point(colour = "blue")
```

<br><br>Set an aesthetic to a fixed value by specifying it outside of `aes()`
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-5-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm)) +
* geom_point(shape = 21,
*            colour = "blue",
*            fill = "red")
```

<br><br>For shapes (or geoms) that have a border, you can make the border (colour) and the inside (fill) different colors.
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-7-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
*g = ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point()

*g
```

<br><br>
Save your `ggplot` to an object! This object will be used to add additional layers to our graphic.
]

.pull-right[
![](ggplot_presentation_files/figure-html/plot-label-8-out-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

<br>You can change the labels of the axes, as well as modify the limits of the axes.
<br><br>This is done by using:

* `xlab()` and `ylab()` to change the labels
* `xlim()` and `ylim()` to change the limits

<br>Note that the labels can also be changed with `labs()` - to be covered later!

---

## Grammar of Graphics - Essentials

.pull-left[
```r
g +
  xlab("Body Mass (g)") +
  ylab("Bill Length (mm)")
```

<br>Modified labels using `xlab()` and `ylab()`
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-8-1.png)<!-- -->
]

---

## Grammar of Graphics - Essentials

.pull-left[
```r
g +
  xlim(3000, 4000)
```

<br>
Limiting the x-axis to body mass between 3000 and 4000 g
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-10-1.png)<!-- -->
]

---

## Additional Layer 1: Scales

<b>Controls how the data is mapped to the aesthetic</b>

<br>Two main scale methods:

* Position Scales -> Affects your axes
* Color Scales -> Affects your colors/legends

.pull-left[
<br>
<b> Position Scales</b>

* Change labels
* Change breaks
* Modify limits
* Stat transformations
]

.pull-right[
<br>
<b> Color Scales </b>

* Change color mapping
* Change breaks
* Modify limits
]

---

## Additional Layer 1: Scales

<b>Controls how the data is mapped to the aesthetic</b>

<br>All scale functions tend to follow a similar formula:

`scale` + `_` + `<aes>` + `_` + `<type>` + `()`

<br>Where:

* aes = What variable do you want to change?
* type = What type of variable?
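
<br>A minimal sketch of the naming formula in action. To stay self-contained it does not load {palmerpenguins}; instead it assumes a small data frame, here called `penguins_preview` (a hypothetical name), built from the six rows shown on the data slide. Note that `manual` fills the `<type>` slot for hand-picked values, and `scale_colour_*` and `scale_color_*` are interchangeable spellings.

```r
library(ggplot2)

# Assumption: six rows copied from the data preview slide, not the full dataset
penguins_preview <- data.frame(
  body_mass_g    = c(3750, 3800, 3250, 3450, 3650, 3625),
  bill_length_mm = c(39.1, 39.5, 40.3, 36.7, 39.3, 38.9),
  sex            = c("male", "female", "female", "female", "male", "female")
)

p <- ggplot(penguins_preview,
            aes(x = body_mass_g,
                y = bill_length_mm,
                colour = sex)) +
  geom_point() +
  # scale + _ + x + _ + continuous: renames the x axis
  scale_x_continuous(name = "Body Mass (g)") +
  # scale + _ + colour + _ + manual: hand-picked colour per category
  scale_colour_manual(values = c("female" = "#ee6f57",
                                 "male" = "#1f3c88"))

p
```

Swapping the `<aes>` or `<type>` part gives the other members of the family, for example `scale_y_continuous()` for the y axis.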
---

## Additional Layer 1: Scales

<b>Controls how the data is mapped to the aesthetic</b>

<br>All scale functions tend to follow a similar formula:

`scale` + `_` + `<aes>` + `_` + `<type>` + `()`

.pull-left[
<br>
If we wanted to use a different color palette for a continuous variable:

* aesthetic variable = color
* type of variable = continuous

`scale_color_continuous()`
]

.pull-right[
<br>
If we wanted to change the limits for the date variable on the x axis:

* aesthetic variable = x
* type of variable = date

`scale_x_date()`
]

---

## Additional Layer 1: Scales

.pull-left[
<b> Position scales - change your axes </b>

```r
g +
  scale_x_continuous(
    limits = c(3000, 4000))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-12-1.png)<!-- -->
]

---

## Additional Layer 1: Scales

.pull-left[
<b> Position scales - change your axes </b>

```r
g +
  scale_x_continuous("Body Mass (g)")
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-14-1.png)<!-- -->
]

---

## Additional Layer 1: Scales

.pull-left[
<b> Position scales - change your axes </b>

```r
g +
  scale_x_continuous(
    breaks = c(4000, 6000))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-16-1.png)<!-- -->
]

---

## Additional Layer 1: Scales

.pull-left[
<b> Colour scales - change your legends </b>

```r
col = c("female" = "#ee6f57",
        "male" = "#1f3c88")

g +
  scale_color_manual(
    values = col)
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-18-1.png)<!-- -->
]

---

## Additional Layer 1: Scales

.pull-left[
<b> Colour scales - change your legends</b>

```r
col = c("female" = "#ee6f57",
        "male" = "#1f3c88")

g +
  scale_color_manual(
    values = col,
    labels = c("Female Penguins",
               "Male Penguins"))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-20-1.png)<!-- -->
]

---

## Additional Layer 2: Coordinate systems

Two main uses:

+ Combine two position aesthetics (x and y) to produce 2D position on plot
+ Draw axes and panel backgrounds

<br>Commonly Used `coord_*()`:

* `coord_flip()` - flips your position aesthetics
* `coord_cartesian()` - cartesian plane
* `coord_polar()` - polar plane
* and more!

---

## Additional Layer 2: Coordinate systems

.pull-left[
`coord_flip()` flips your x and y axes

```r
g +
  coord_flip()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-22-1.png)<!-- -->
]

---

## Additional Layer 2: Coordinate systems

.pull-left[
`coord_polar()` converts your chart into a circle

```r
g +
  coord_polar()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-24-1.png)<!-- -->
]

---

## Additional Layer 3: Facets

<br>Generates small multiples that each show a different subset of the data
<br>
Alternative to aesthetics for displaying additional categorical/discrete variables
<br>Two `facet_*()` options:
<br>
`facet_grid()` and `facet_wrap()`

---

## Additional Layer 3: Facets

.pull-left[
```r
g +
  facet_wrap(~species)
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-26-1.png)<!-- -->
]

---

## Additional Layer 3: Facets

.pull-left[
```r
g +
  facet_grid(rows = vars(species),
             cols = vars(year))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-28-1.png)<!-- -->
]

---

## Honorable Mentions: Labs & Themes

`labs()` and `theme()` are not considered "layers" as part of the Grammar of Graphics.
<br><br>

--

<b> Why? </b>
<br>They only control non-data elements of the plot.

---

## Honorable Mentions: Labs

.pull-left[
Labs allow you to change the labels of the plot

```r
g +
  labs(x = "Body Mass (g)",
       y = "Bill Length (mm)",
       title = "Palmer Penguins",
       subtitle = "Bill Length (mm) vs. Body Mass (g)",
       caption = "Created by Ijeamaka")
```

<br>
This is similar to `xlab()` and `ylab()` but contained in one function
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-30-1.png)<!-- -->
]

---

## Honorable Mentions: Labs

.pull-left[
```r
g +
  labs(x = NULL,
       y = NULL,
       title = "Palmer Penguins",
       subtitle = "Bill Length (mm) vs. Body Mass (g)",
       caption = "Created by Ijeamaka")
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-32-1.png)<!-- -->
]

---

## Honorable Mentions: Themes

.pull-left[
Control the plot appearance/decorations

```r
g +
  theme_minimal()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-34-1.png)<!-- -->
]

---

## Honorable Mentions: Themes

.pull-left[
Control the plot appearance/decorations

```r
g +
  theme_classic()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-36-1.png)<!-- -->
]

---

## Honorable Mentions: Themes

.pull-left[
You can customize your own `theme()`

```r
g +
  facet_grid(rows = vars(species),
             cols = vars(year)) +
  theme(
    plot.background = element_blank(),
    panel.background = element_rect(fill = "white"),
    panel.grid = element_line(colour = "grey"),
    strip.background = element_rect(fill = "black"),
    strip.text = element_text(face = "bold",
                              colour = "white",
                              size = 13))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-38-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins)
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-40-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
*      aes(x = body_mass_g,
*          y = bill_length_mm,
*          colour = sex))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-42-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
* geom_point()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-44-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point() +
* scale_color_manual(
*   values = c("#ee6f57", "#1f3c88"))
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-46-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point() +
  scale_color_manual(
    values = c("#ee6f57", "#1f3c88")) +
* coord_flip()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-48-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point() +
  scale_color_manual(
    values = c("#ee6f57", "#1f3c88")) +
  coord_flip() +
* facet_wrap(~species)
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-50-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point() +
  scale_color_manual(
    values = c("#ee6f57", "#1f3c88")) +
  coord_flip() +
  facet_wrap(~species) +
* labs(title = "Palmer Penguins")
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-52-1.png)<!-- -->
]

---

## Grammar of Graphics Summarized: Let's Build a Plot!

.pull-left[
```r
ggplot(data = penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           colour = sex)) +
  geom_point() +
  scale_color_manual(
    values = c("#ee6f57", "#1f3c88")) +
  coord_flip() +
  facet_wrap(~species) +
  labs(title = "Palmer Penguins") +
* theme_minimal()
```
]

.pull-right[
![](ggplot_presentation_files/figure-html/unnamed-chunk-54-1.png)<!-- -->
]

---

## Save your plot!

.pull-left[
ggplot2 has a convenient function, `ggsave()`, that allows you to:

* save your plot
* change graphic type
* define the width and height of the plot
* and more
]

.pull-right[
```r
ggsave(
  filename = "plot_name.png",
  plot = last_plot(),
  # Choose one of: "pdf", "jpeg", "tiff", "png", "svg"
  device = "png",
  # Path
  path = "path/to/folder"
)
```
]

---

## Sources and Resources

ggplot2: elegant graphics for data analysis - WIP 3rd-edition book freely accessible online!
<br>
ggplot2.tidyverse.org - Tidyverse reference documentation is A+
<br>
Packages used in presentation

```r
library(palmerpenguins)
library(dplyr)
library(ggplot2)
```

---

## Waffles says thank you for attending!

.pull-left[
<img src="supplemental_graphics/IMG-1561.jpg" width="75%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="supplemental_graphics/IMG-2982.jpg" width="75%" style="display: block; margin: auto;" />
]