Read R4DS chapter 7 about exploratory data analysis.
Complete the chapters Exploring Categorical Data and Exploring Numerical Data of the Exploratory Data Analysis course at DataCamp.
Use filter to extract the product groups
c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin") of
vintage 2011-2018. Try and compare the following bar charts:
- ggplot with aes(x = Argang) and geom_bar()
- ggplot with aes(x = Argang), geom_bar() and facet_wrap(~ Varugrupp) (try
  adding scales = "free_y" to facet_wrap)
- ggplot with aes(x = Argang, fill = Varugrupp) and one of
  - geom_bar()
  - geom_bar(position = "dodge")
  - geom_bar(position = "fill")
Recreate the following plot (Red wines in the regular range)
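The bar chart variants listed above can be tried out on a small made-up data set before moving to the real product data. The sketch below is only an illustration: the tibble wines_toy and its values are hypothetical, and only the column names Varugrupp and Argang and the four group labels come from the exercise text.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the product data (values are invented)
wines_toy <- tibble(
  Varugrupp = c("Vitt vin", "Rött vin", "Rött vin", "Rosévin",
                "Mousserande vin", "Öl", "Rött vin", "Vitt vin"),
  Argang    = c(2015, 2016, 2016, 2017, 2012, 2018, 2009, 2018)
)

wine_groups <- c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin")

# Keep the four wine groups and the vintages 2011-2018
wines_sel <- wines_toy %>%
  filter(Varugrupp %in% wine_groups, between(Argang, 2011, 2018))

# Base plot shared by the three position variants
p <- ggplot(wines_sel, aes(x = Argang, fill = Varugrupp))

p_stack <- p + geom_bar()                    # stacked counts (the default)
p_dodge <- p + geom_bar(position = "dodge")  # side-by-side counts
p_fill  <- p + geom_bar(position = "fill")   # proportions within each vintage

# Faceted version: one panel per group, each with its own y-axis
p_facet <- ggplot(wines_sel, aes(x = Argang)) +
  geom_bar() +
  facet_wrap(~ Varugrupp, scales = "free_y")
```

Printing p_stack, p_dodge and p_fill one after another makes the trade-off visible: stacking shows totals per vintage, dodging makes group counts comparable, and filling shows shares but hides the totals.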
Make a box plot of PrisPerLiter on the
log scale, with x = Varugrupp. Try coord_flip
to improve readability.
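One possible way to set up such a box plot is sketched below. The data frame prices_toy and its prices are invented for illustration; only the column names PrisPerLiter and Varugrupp come from the exercise. Using scale_y_log10() is one of several ways to get a log scale and is an assumption about what is intended here.

```r
library(ggplot2)

# Hypothetical prices (invented values), four products per group
prices_toy <- data.frame(
  Varugrupp    = rep(c("Vitt vin", "Rött vin", "Rosévin"), each = 4),
  PrisPerLiter = c(80, 105, 130, 400, 95, 120, 260, 1500, 90, 100, 115, 150)
)

p_box <- ggplot(prices_toy, aes(x = Varugrupp, y = PrisPerLiter)) +
  geom_boxplot() +
  scale_y_log10() +  # log10 axis; the box statistics are computed on the log values
  coord_flip()       # horizontal boxes, so the group labels are easier to read
```

With a few very expensive products, the log scale keeps the boxes from being squeezed against the axis, which is usually the reason for using it with price data.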
The following code transforms the medals data to “long” format (more
about this next time!) which is easier to work with in
ggplot:
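library(tidyverse)  # provides read_csv, select, pivot_longer and %>%

medal_long <- read_csv("../class_files/Winter_medals2019-10-30.csv") %>%
  select(-Total) %>%                                  # drop the precomputed total
  pivot_longer(cols = c("Gold", "Silver", "Bronze"),  # columns to stack
               names_to = "Medal",                    # former column names end up here
               values_to = "Number")                  # medal counts end up here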
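To see what the reshaping step does, the same pipeline can be run on a tiny wide table. This is a sketch on made-up numbers: medal_wide and the column name Country are assumptions, since the layout of the course file is not shown here.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table (invented counts, assumed Country column)
medal_wide <- tibble(
  Country = c("A", "B"),
  Gold    = c(2, 0),
  Silver  = c(1, 3),
  Bronze  = c(0, 1),
  Total   = c(3, 4)
)

medal_long_toy <- medal_wide %>%
  select(-Total) %>%
  pivot_longer(cols = c("Gold", "Silver", "Bronze"),
               names_to = "Medal",
               values_to = "Number")

# One row per country and medal type: 2 countries x 3 medals = 6 rows,
# with columns Country, Medal and Number
```

The long layout has a single Medal column, which is what lets ggplot map medal type to an aesthetic such as fill.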
Check the result with glimpse(medal_long). Use
group_by and summarise to aggregate
the total number of medals (Gold/Silver/Bronze) for each country.
Illustrate the relative proportions of medals, e.g. with
geom_bar using stat = "identity" and
position = "fill".
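The aggregation and the proportion plot could look roughly like the sketch below. It runs on invented long-format data with several rows per country; the column name Country is an assumption, so adapt it to whatever glimpse(medal_long) shows.

```r
library(dplyr)
library(ggplot2)

# Hypothetical long-format data: two rows per country and medal type
medal_long_toy <- tibble(
  Country = rep(c("A", "B"), each = 6),
  Medal   = rep(c("Gold", "Silver", "Bronze"), times = 4),
  Number  = c(2, 1, 0, 1, 1, 2, 0, 3, 1, 1, 0, 0)
)

# Total number of medals of each type per country
medal_totals <- medal_long_toy %>%
  group_by(Country, Medal) %>%
  summarise(Number = sum(Number), .groups = "drop")

# stat = "identity" uses Number as bar height instead of counting rows;
# position = "fill" rescales each country's bar to sum to 1
p_medals <- ggplot(medal_totals, aes(x = Country, y = Number, fill = Medal)) +
  geom_bar(stat = "identity", position = "fill")
```

geom_col() is shorthand for geom_bar(stat = "identity"), so geom_col(position = "fill") would draw the same chart.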
The file class_files/MM2001_results.csv
contains the age, sex, and grade in the course Matematik I (MM2001) of 3201
students aged 18-40 years. An NA in the grade column means
that the student has registered for but not yet completed the
course.
Use ggplot to explore relations between the
variables.
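As a possible starting point, the sketch below shows two plots on invented data. The column names age, sex and grade, the grade labels, and the "Not completed" category are all assumptions; check the real file with glimpse before adapting it.

```r
library(dplyr)
library(ggplot2)

# Hypothetical rows; real column names and grade scale may differ
results_toy <- tibble(
  age   = c(19, 22, 25, 31, 20, 38, 24, 19),
  sex   = c("F", "M", "F", "M", "M", "F", "F", "M"),
  grade = c("A", NA, "C", "F", "B", NA, "A", "E")
)

# Turn NA into an explicit category so unfinished students stay visible
results_toy <- results_toy %>%
  mutate(grade = if_else(is.na(grade), "Not completed", grade))

# Age distribution per grade, one panel per sex
p_age <- ggplot(results_toy, aes(x = grade, y = age)) +
  geom_boxplot() +
  facet_wrap(~ sex)

# Grade proportions within each sex
p_grade <- ggplot(results_toy, aes(x = sex, fill = grade)) +
  geom_bar(position = "fill")
```

Recoding NA first is a design choice: otherwise students who have not completed the course appear as an unlabeled NA group or are silently dropped, depending on the geom.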