Read R4DS chapter 7 about exploratory data analysis.
Solve Exploring Categorical Data and Exploring Numerical Data of the Exploratory Data Analysis course at DataCamp.
Use filter to extract the product groups c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin") of vintage 2011-2018. Try and compare the following bar charts:
ggplot with aes(x = Argang) and geom_bar()
ggplot with aes(x = Argang), geom_bar() and facet_wrap(~ Varugrupp) (try adding scales = "free_y" to facet_wrap)
ggplot with aes(x = Argang, fill = Varugrupp) and, in turn, geom_bar(), geom_bar(position = "dodge") and geom_bar(position = "fill")
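As a starting point, the filtering step and two of the bar charts might be sketched as follows. The data frame name products and its values are made up for illustration, and Argang is assumed to be stored as a number; only the column names Varugrupp and Argang come from the exercise.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the product data (hypothetical name and values)
products <- data.frame(
  Varugrupp = c("Rött vin", "Rött vin", "Vitt vin", "Rosévin",
                "Mousserande vin", "Öl", "Rött vin", "Vitt vin"),
  Argang    = c(2015, 2016, 2015, 2017, 2012, NA, 2009, 2018)
)

wine_groups <- c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin")

# Keep the four wine groups and vintages 2011-2018;
# rows with a missing vintage are dropped by filter()
wines <- products %>%
  filter(Varugrupp %in% wine_groups,
         between(Argang, 2011, 2018))

# Counts per vintage, one panel per product group, each with its own y-axis
p_facet <- ggplot(wines, aes(x = Argang)) +
  geom_bar() +
  facet_wrap(~ Varugrupp, scales = "free_y")

# Shares of the product groups within each vintage
p_fill <- ggplot(wines, aes(x = Argang, fill = Varugrupp)) +
  geom_bar(position = "fill")
```

Print p_facet and p_fill to see the charts. Swapping position = "fill" for "dodge", or dropping it, gives the other two variants.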
Recreate the following plot (Red wines in the regular range)
Make a box plot (geom_boxplot) of PrisPerLiter on the log scale, with x = Varugrupp. Try coord_flip to improve readability.
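One possible way to set this up is sketched below on made-up prices. The data frame wines and its values are assumptions, and scale_y_log10() is just one of several ways to get a log scale.

```r
library(ggplot2)

# Made-up prices; only the column names come from the exercise
wines <- data.frame(
  Varugrupp    = rep(c("Rött vin", "Vitt vin", "Rosévin"), each = 4),
  PrisPerLiter = c(90, 120, 400, 2500, 85, 110, 150, 900, 80, 95, 105, 130)
)

p_box <- ggplot(wines, aes(x = Varugrupp, y = PrisPerLiter)) +
  geom_boxplot() +
  scale_y_log10() +  # log scale on the price axis
  coord_flip()       # horizontal boxes, so the group labels are easier to read
```

Print p_box to see the plot. Without the log scale, a few expensive bottles would compress most of the boxes towards zero.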
The following code transforms the medals data to “long” format (more about this next time!) which is easier to work with in ggplot:
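library(tidyverse)  # provides read_csv, select, pivot_longer and %>%

# One row per country and medal type instead of one column per medal type
medal_long <- read_csv("../class_files/Winter_medals2019-10-30.csv") %>%
  select(-Total) %>%
  pivot_longer(cols = c("Gold", "Silver", "Bronze"),
               names_to = "Medal",
               values_to = "Number")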
Check the result with glimpse(medal_long). Use group_by and summarise to aggregate the total number of medals (Gold/Silver/Bronze) for each country.
Illustrate the relative proportions of medals, e.g. by geom_bar with stat = "identity" and position = "fill".
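The aggregation and the proportion plot could look roughly like this. The sketch uses a small made-up table in the same shape as medal_long; the column name Country and all values are assumptions, so check the actual names with glimpse(medal_long) first.

```r
library(dplyr)
library(ggplot2)

# Toy data shaped like medal_long (hypothetical Country column and values);
# country A appears twice per medal type, so aggregation is needed
medal_toy <- data.frame(
  Country = c("A", "A", "A", "A", "A", "A", "B", "B", "B"),
  Medal   = c("Gold", "Silver", "Bronze", "Gold", "Silver", "Bronze",
              "Gold", "Silver", "Bronze"),
  Number  = c(2, 1, 0, 1, 1, 3, 0, 2, 2)
)

# Total number of medals of each type per country
medal_totals <- medal_toy %>%
  group_by(Country, Medal) %>%
  summarise(Number = sum(Number), .groups = "drop")

# Each bar is scaled to height 1, so the fill shows the medal shares
p_medals <- ggplot(medal_totals, aes(x = Country, y = Number, fill = Medal)) +
  geom_bar(stat = "identity", position = "fill")
```

Here the totals are taken per country and medal type, which is one reading of the exercise. Print p_medals to see the plot.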
The file class_files/MM2001_results.csv contains the age, sex, and grade on the course Matematik I (MM2001) of 3201 students aged 18-40 years. An NA in the grade column means that the student has registered but has not yet completed the course.
Use ggplot to explore relations between the variables.