class: language-r
layout: true
---

# Lab 1 — September 20

## Question 1

### Load the tidyverse

```r
library(tidyverse)
```

<PRE class="fansi fansi-message"><CODE>## -- <span style='font-weight: bold;'>Attaching packages</span> --------------------------------------- tidyverse 1.3.1 --
</CODE></PRE><PRE class="fansi fansi-message"><CODE>## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>ggplot2</span> 3.3.6 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>purrr </span> 0.3.4
## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>tibble </span> 3.1.6 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>dplyr </span> 1.0.9
## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>tidyr </span> 1.2.0 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>stringr</span> 1.4.0
## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>readr </span> 2.1.2 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>forcats</span> 0.5.1
</CODE></PRE><PRE class="fansi fansi-message"><CODE>## -- <span style='font-weight: bold;'>Conflicts</span> ------------------------------------------ tidyverse_conflicts() --
## <span style='color: #BB0000;'>x</span> <span style='color: #0000BB;'>dplyr</span>::<span style='color: #00BB00;'>filter()</span> masks <span style='color: #0000BB;'>stats</span>::filter()
## <span style='color: #BB0000;'>x</span> <span style='color: #0000BB;'>dplyr</span>::<span style='color: #00BB00;'>lag()</span> masks <span style='color: #0000BB;'>stats</span>::lag()
</CODE></PRE>

---

# Change plotting theme

```r
theme_set(theme_bw())
```

- I call `theme_set(theme_bw())` so that every plot I create with ggplot2 has a white background instead of the default gray one
- See the [ggplot2 theme reference](https://ggplot2.tidyverse.org/reference/ggtheme.html#examples) for the built-in themes and previews of each

---

# Read in the data

- The tidyverse is a *collection* of packages
- Calling `library(tidyverse)` conveniently loads eight packages (though you likely won't be using all eight)
- The functions you will use for your analyses live in these eight packages, not in the `tidyverse` package itself
- The function for reading CSV files is `read_csv`, found in the `readr` package
- Its documentation can be opened with either of:

```r
help(read_csv, package="readr")
?readr::read_csv
```

---

# Read in the data

```r
weather <- read_csv("./data/ottawa-all-monthly.csv")
```

<PRE class="fansi fansi-message"><CODE>## <span style='font-weight: bold;'>Rows: </span><span style='color: #0000BB;'>1405</span> <span style='font-weight: bold;'>Columns: </span><span style='color: #0000BB;'>9</span>
## <span style='color: #00BBBB;'>--</span> <span style='font-weight: bold;'>Column specification</span> <span style='color: #00BBBB;'>--------------------------------------------------------</span>
## <span style='font-weight: bold;'>Delimiter:</span> ","
## <span style='color: #BB0000;'>chr</span> (1): Month
## <span style='color: #00BB00;'>dbl</span> (7): Year, Mean.Max.Temp.C., Mean.Min.Temp.C., Mean.Temp.C., Total.Rain...
## <span style='color: #0000BB;'>date</span> (1): Date.Time
##
## <span style='color: #00BBBB;'>i</span> Use `spec()` to retrieve the full column specification for this data.
## <span style='color: #00BBBB;'>i</span> Specify the column types or set `show_col_types = FALSE` to quiet this message.
</CODE></PRE>

- I do not recommend setting `show_col_types = FALSE`
- If any columns were not read in as the correct type, you should deal with them immediately (or at least be aware of them)
- Example: notice that `Month` was read in as a column of type character, so the filters later on compare it against strings such as `"09"`

---

# Preview the data

```r
head(weather)
```

<PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 6 x 9</span>
## Date.Time Year Month Mean.Max.Temp.C. Mean.Min.Temp.C. 
Mean.Temp.C. ## <span style='color: #555555; font-style: italic;'><date></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> 1889-11-01 <span style='text-decoration: underline;'>1</span>889 11 4.6 -<span style='color: #BB0000;'>2.2</span> 1.2 ## <span style='color: #555555;'>2</span> 1889-12-01 <span style='text-decoration: underline;'>1</span>889 12 -<span style='color: #BB0000;'>1.7</span> -<span style='color: #BB0000;'>9.6</span> -<span style='color: #BB0000;'>5.7</span> ## <span style='color: #555555;'>3</span> 1890-01-01 <span style='text-decoration: underline;'>1</span>890 01 -<span style='color: #BB0000;'>4.2</span> -<span style='color: #BB0000;'>15.5</span> -<span style='color: #BB0000;'>9.9</span> ## <span style='color: #555555;'>4</span> 1890-02-01 <span style='text-decoration: underline;'>1</span>890 02 -<span style='color: #BB0000;'>3.7</span> -<span style='color: #BB0000;'>13.9</span> -<span style='color: #BB0000;'>8.8</span> ## <span style='color: #555555;'>5</span> 1890-03-01 <span style='text-decoration: underline;'>1</span>890 03 -<span style='color: #BB0000;'>0.3</span> -<span style='color: #BB0000;'>9.3</span> -<span style='color: #BB0000;'>4.8</span> ## <span style='color: #555555;'>6</span> 1890-04-01 <span style='text-decoration: underline;'>1</span>890 04 9.9 -<span style='color: #BB0000;'>1.2</span> 4.4 ## <span style='color: #555555;'># ... with 3 more variables: Total.Rain.mm. <dbl>, Total.Snow.cm. <dbl>,</span> ## <span style='color: #555555;'># Total.Precip.mm. 
<dbl></span>
</CODE></PRE>

---

# Inspect the data

```r
dim(weather)
```

```
## [1] 1405 9
```

```r
nrow(weather)
```

```
## [1] 1405
```

```r
ncol(weather)
```

```
## [1] 9
```

---

# Inspect the data

```r
names(weather)
```

```
## [1] "Date.Time" "Year" "Month" "Mean.Max.Temp.C."
## [5] "Mean.Min.Temp.C." "Mean.Temp.C." "Total.Rain.mm." "Total.Snow.cm."
## [9] "Total.Precip.mm."
```

```r
length(names(weather))
```

```
## [1] 9
```

---

# September

```r
# Month is a character column, so compare against the string "09"
September <- weather %>%
  filter(Month == "09")
```

---

# Plot 1

.pull-left[
```r
ggplot(September, aes(x=Date.Time, y=Mean.Max.Temp.C.))+
  geom_point()
```
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 2

.pull-left[
```r
ggplot(September, aes(x=Date.Time, y=Mean.Max.Temp.C.))+
  geom_point()+
  geom_smooth(method = "loess", formula = y ~ x)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-12-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 3

.pull-left[
```r
ggplot(September, aes(x=Date.Time, y=Mean.Max.Temp.C.))+
  geom_point()+
  geom_line(aes(colour=Year))
```
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-14-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 4

.pull-left[
```r
ggplot(September, aes(x=Date.Time, y=Mean.Max.Temp.C.))+
  geom_point()+
  geom_line(aes(colour=Year))+
  geom_smooth(method = "loess", formula = y ~ x)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-16-1.svg" style="display: block; margin: auto;" />
]

---

# Question 2

## October

```r
October <- weather %>%
  filter(Month == "10")
```

---

# Combine September and October

```r
# Stack the two monthly subsets; both have the same nine columns
SeptOct <- bind_rows(September, October)
dim(SeptOct)
```

```
## [1] 234 9
```

---

# Select columns

```r
SeptOct <- SeptOct %>%
  select(Year, Month, Mean.Max.Temp.C., Mean.Min.Temp.C., Total.Rain.mm., Total.Snow.cm.)
```

---

# Make new variable

```r
# Snow is recorded in cm, so multiply by 10 to get mm before adding the rain (mm)
SeptOctnew <- SeptOct %>%
  mutate(Total.precip = Total.Rain.mm. + Total.Snow.cm. * 10)
```

---

# Plot 1

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip))+
  geom_point()
```
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-22-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 2

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip))+
  geom_point()+
  geom_smooth(method = "loess", formula = y ~ x)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-24-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 3

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip))+
  geom_point()+
  geom_line(aes(colour=Year))
```
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-26-1.svg" style="display: block; margin: auto;" />
]

---

# Plot 4

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip))+
  geom_point()+
  geom_line(aes(colour=Year))+
  geom_smooth(method = "loess", formula = y ~ x)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-28-1.svg" style="display: block; margin: auto;" />
]

---

# New plot

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Total.precip, y=Mean.Max.Temp.C.))+
  geom_point()+
  geom_smooth(method = "loess", formula = y ~ x)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-30-1.svg" style="display: block; margin: auto;" />
]

---

# Fix that code

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip), colour=Month)+
  geom_point()+
  geom_smooth(y ~ x)
```

<PRE class="fansi fansi-error"><CODE>## <span style='color: #BBBB00; font-weight: bold;'>Error</span><span style='font-weight: bold;'> in `validate_mapping()`:</span>
## <span style='color: #BBBB00;'>!</span> `mapping` must be created by `aes()`
</CODE></PRE>
]

.pull-right[
- `colour = Month` should be defined inside the aesthetics, i.e. within `aes()`
- `y ~ x` should be passed to the `formula` argument
- As written, `y ~ x` is unnamed, so it is matched to `mapping`, the first argument of `geom_smooth()`, which triggers the error
]

---

# Fix that code

.pull-left[
```r
ggplot(SeptOctnew, aes(x=Year, y=Total.precip, colour=Month))+
  geom_point()+
  geom_smooth(method = "loess", formula = y ~ x)
```
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-33-1.svg" style="display: block; margin: auto;" />
]

---

# Facetting

.pull-left[
```r
weather %>%
  filter(Month == "09" | Month == "10" | Month == "11") %>%
  mutate(Precipitation = Total.Rain.mm. + 25 * Total.Snow.cm.) %>%
  ggplot(aes(x=Year, y=Precipitation))+
  geom_line(aes(colour=Month))+
  geom_smooth(method = "loess", formula = y ~ x)+
  facet_wrap(~Month)
```

- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-35-1.svg" style="display: block; margin: auto;" />
]

---

# Better facetting

.pull-left[
```r
weather %>%
  filter(Month %in% c("09", "10", "11")) %>%
  mutate(Precipitation = Total.Rain.mm. + 25 * Total.Snow.cm.) %>%
  ggplot(aes(x=Year, y=Precipitation))+
  geom_line()+
  geom_smooth(method = "loess", formula = y ~ x)+
  facet_wrap(~Month, nrow=3, labeller="label_both")
```

- `Month %in% c("09", "10", "11")` filters the months with less typing than chaining `|` comparisons
- Colouring by month is no longer needed since we are facetting by month
- `labeller = "label_both"` makes it clear that the variable being facetted is `Month`, though an even better option would be to use the full month names, e.g. September, October, November
- Change the layout to 3x1 (`nrow = 3`), since time series plots are usually wide
- Specifying `method = "loess"` and `formula = y ~ x` is optional; if they are omitted, ggplot2 uses these same values and prints a message saying so
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-37-1.svg" style="display: block; margin: auto;" />
]
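
---

# Facetting with full month names

The previous slide suggests that full month names would label the facets better than `label_both`. Here is a minimal sketch of one way to do that. It is not part of the lab solution: `toy` is a small made-up data frame standing in for `weather` (its values are invented for illustration), and `toy_named` and `p` are hypothetical names. It assumes `Month` holds two-digit strings such as `"09"`, as in the data above, and uses base R's built-in `month.name` vector.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `weather`: invented values, for illustration only
toy <- tibble(
  Year = rep(2000:2003, times = 3),
  Month = rep(c("09", "10", "11"), each = 4),
  Precipitation = c(80, 95, 70, 88, 75, 60, 90, 82, 65, 72, 85, 78)
)

# month.name is a base R constant ("January", ..., "December");
# converting "09" to the integer 9 lets us index into it
toy_named <- toy %>%
  mutate(Month = factor(month.name[as.integer(Month)],
                        levels = month.name[9:11]))

# Facet labels now read September, October, November
p <- ggplot(toy_named, aes(x = Year, y = Precipitation)) +
  geom_line() +
  facet_wrap(~Month, nrow = 3)
p
```

Setting `levels` explicitly keeps the panels in calendar order; without it, the factor levels (and therefore the facets) would be sorted alphabetically, putting November first.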