Do this before class

Read R4DS chapters 3.1-3.6, 5.1-5.5

Complete the assignments Data wrangling and Data visualization (the first two chapters of Introduction to the Tidyverse) at DataCamp.

Class activities

Start by creating a new R project “Classroom” that you will use for your class activities. Some activities require data or scripts from the repo Class_files, so we recommend that you clone it into a subfolder of your Classroom directory. Then create a new R Markdown document Class1.Rmd where you will do your work for this class.

Systembolaget’s assortment

Systembolaget’s assortment of beverages from 2019-10-30 is available in the file Class_files/systembolaget2019-10-30.csv. It was downloaded from Systembolaget’s public API and saved in CSV format by the script Class_files/Systembolaget.R. Unfortunately, Systembolaget changed how it shares information via APIs on Nov 1, 2022, because the product-related information had been used in ways that went against the purpose of Swedish alcohol policy and Systembolaget’s mission. Load the data with

# Define date when scraping took place - can then be easily changed.
date_systembolaget_scrape <- "2019-10-30"

library(tidyverse)
file_name <- paste0("systembolaget",date_systembolaget_scrape,".csv")
Sortiment_hela <- read_csv(file.path("Class_files", file_name))

Exercises (training of `arrange`, `filter`, `mutate`, `select`, `%>%`)

The variable Alkoholhalt (alcohol by volume) has been classified as character by read_csv, since it contains a percent sign. Convert it to numeric using mutate, by first removing the percent sign (e.g. with gsub) and then converting with as.numeric.
A few wines are labelled as Röda - lägre alkoholhalt and Vita - lägre alkoholhalt instead of Rött vin (red wine) and Vitt vin (white wine), respectively, in the Varugrupp (group of products) column. Merge these wines into Rött vin and Vitt vin, respectively, e.g. by using mutate and ifelse.
Which beverage has the highest PrisPerLiter? Display the answer (the Namn of the beverage) with dynamically evaluated inline code in the text body of your .Rmd document.
Create a new data frame Sortiment_ord with the regular product range (where SortimentText equals Ordinarie sortiment). Make a table (with kable from the knitr package) of the 10 most expensive (by PrisPerLiter) beverages from this range. Use select to pick suitable columns for the table.
If you have not already done so, rewrite the code from the previous exercise as a sequence of pipes (%>%).
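The steps in these exercises can be tried out on a small made-up tibble before running them on the full assortment. The sketch below is only an illustration: the rows in `toy` are invented, and the exact format of the percent strings in the real file is an assumption.

```r
library(dplyr)

# Invented rows that only mimic the shape of the real data (assumption)
toy <- tibble(
  Namn         = c("A", "B", "C", "D"),
  Varugrupp    = c("Rött vin", "Röda - lägre alkoholhalt",
                   "Vitt vin", "Vita - lägre alkoholhalt"),
  Alkoholhalt  = c("13.5%", "5.5%", "12%", "8%"),
  PrisPerLiter = c(120, 90, 200, 80)
)

toy_clean <- toy %>%
  mutate(
    # Drop the percent sign, then convert the remaining text to a number
    Alkoholhalt = as.numeric(gsub("%", "", Alkoholhalt)),
    # Fold the low-alcohol labels into the main wine groups
    Varugrupp = ifelse(Varugrupp == "Röda - lägre alkoholhalt", "Rött vin", Varugrupp),
    Varugrupp = ifelse(Varugrupp == "Vita - lägre alkoholhalt", "Vitt vin", Varugrupp)
  )

# The two most expensive beverages, written as one sequence of pipes
top2 <- toy_clean %>%
  arrange(desc(PrisPerLiter)) %>%
  select(Namn, Varugrupp, PrisPerLiter) %>%
  head(2)
```

A result such as `top2` can be passed to `knitr::kable()` to get a formatted table, and a single value such as `top2$Namn[1]` can be placed in the text of an .Rmd document with inline code of the form `` `r top2$Namn[1]` ``.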

Further exercises

Use your imagination and keep exploring the data.

Film events

The Stockholm International Film Festival takes place in early November each year. In Class_files/Film_events_2018-11-07.csv you will find the event schedule for the 2018 edition.

Exercises (training of `arrange`, `filter`, `mutate`, `select`, `%>%`)

Which films are already sold out (for all screenings)?
Which venue screens the largest number of (unique) films?
Plot the proportion of sold out events for each day of the festival.
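One possible way to approach these questions is to group the events and summarise each group. The sketch below uses an invented tibble: the column names `film`, `venue`, `day` and `sold_out` are hypothetical and will probably differ from those in the real file, so treat it as a pattern rather than a solution.

```r
library(dplyr)
library(ggplot2)

# Hypothetical columns and invented rows (assumption, not the real schedule)
events <- tibble(
  film     = c("Film A", "Film A", "Film B", "Film B", "Film C"),
  venue    = c("Venue 1", "Venue 2", "Venue 1", "Venue 1", "Venue 1"),
  day      = as.Date(c("2018-11-07", "2018-11-08", "2018-11-07",
                       "2018-11-08", "2018-11-08")),
  sold_out = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

# Films for which every screening is sold out
sold_out_films <- events %>%
  group_by(film) %>%
  summarise(all_sold_out = all(sold_out)) %>%
  filter(all_sold_out)

# Number of distinct films per venue, largest first
films_per_venue <- events %>%
  group_by(venue) %>%
  summarise(n_films = n_distinct(film)) %>%
  arrange(desc(n_films))

# Proportion of sold-out events per day (the mean of a logical is a proportion)
sold_out_by_day <- events %>%
  group_by(day) %>%
  summarise(prop_sold_out = mean(sold_out))

p <- ggplot(sold_out_by_day, aes(x = day, y = prop_sold_out)) +
  geom_point() +
  geom_line()
```

Printing `p` draws the plot. With the real data, the first step is to inspect the column names (e.g. with `glimpse`) and adapt the names used here.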

Olympic winter medals

The file Class_files/Winter_medals2022-11-03.csv contains the number of medals per country and Olympic year at the Winter Olympics since 1980, together with the total population of the country. The data set is scraped from Wikipedia using the script Class_files/Winter_medals.R, which contains more information, in particular on countries that have been split or merged during the period.

Load the file using

winter_medals <- read_csv("Class_files/Winter_medals2022-11-03.csv")

Exercises (training of `arrange`, `filter`, `mutate`, `select`, `%>%`)

Create a new column medals_per_mill, the number of medals per million inhabitants.
Print a table of the 10 most successful countries, by medals_per_mill, during the 2022 Winter Olympics.
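The calculation can be sketched on invented numbers as follows. The column names `country`, `year`, `medals` and `population` are hypothetical stand-ins for whatever the real file uses, and the values are made up.

```r
library(dplyr)

# Hypothetical column names and invented numbers (assumption)
toy_medals <- tibble(
  country    = c("X", "Y", "Z", "X"),
  year       = c(2022, 2022, 2022, 2018),
  medals     = c(10, 4, 30, 8),
  population = c(5e6, 1e6, 60e6, 5e6)
)

top_2022 <- toy_medals %>%
  # Medals divided by population measured in millions
  mutate(medals_per_mill = medals / (population / 1e6)) %>%
  filter(year == 2022) %>%
  arrange(desc(medals_per_mill)) %>%
  select(country, medals, medals_per_mill) %>%
  head(10)
```

Here `head(10)` keeps at most ten rows, so the same pipeline works unchanged on the full data set.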

Gapminder

Use ggplot to recreate (static versions of) some figures from Hans Rosling’s talks. The data is available in the gapminder package.
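As a starting point, a bubble chart in the style of those talks might look like the sketch below. The choice of year, scales and faceting is only one option among many, and the gapminder package is assumed to be installed.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

# One year of data: one row per country
gap_2007 <- gapminder %>%
  filter(year == 2007)

# Income on a log scale against life expectancy, bubble size by population
rosling_plot <- ggplot(gap_2007,
                       aes(x = gdpPercap, y = lifeExp,
                           size = pop, colour = continent)) +
  geom_point(alpha = 0.6) +
  scale_x_log10() +
  facet_wrap(~ continent) +
  labs(x = "GDP per capita (log scale)", y = "Life expectancy (years)")
```

Printing `rosling_plot` draws the figure. Dropping the `facet_wrap` line gives a single panel with all continents together.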

Day 2: Tidyverse: Basic `dplyr` and `ggplot2`.
