Read R4DS chapter 19.
Solve the course Introduction to Writing Functions in R at Datacamp.
Complete the exercises in R4DS sections 19.2.1, 19.3.1, 19.4.4, 19.5.5.
Functional programming is a vast (and quite complicated) subject. It is quite different to the usual concept of object oriented programming. Although you should know what functional programming is, we wont cover much about the concept other than the basics.
As the name entails, functions are a big part of functional
programming. In R, functions are objects and consist of three
components. The arguments
(or formals
if you
want to be fancy), a body
and the environment
.
The arguments are exactly what the describe, though they may not need be
defined! The body is what the function is supposed to do and its
environment is what is used
With the question below we are going to take a look at how functions can help with readability but also adaptability.
...
!If you take a look at the R help for the mean
function
you can see the following entry under Arguments
... further arguments passed to or from other methods.
The ...
is essentially a “anything and everything”-type
of argument. If you are used to Python this is kind of the same as
**kwargs
. Anything can go in there, which is really
powerful but also quite worrying. How do we know if the arguments we
pass to ...
will ever be used? Take the following (trivial)
example
x <- runif(10)
mean(x)
mean(x, this_question_is_useless_i_hate_you_felix = FALSE)
Try to answer the following questions:
Do you expect the same output and is it?
Do functions need arguments?
What happens if you misspell the actual argument
trim
with say, trm
?
We are going to use this feature to play games. We use it to set up how a game is played and with what probabilities. Implement the function below according to its instructions.
simulate_round <- function(probability_fun, ...){
# Function which specifies a round of a game. It will
# simulate an outcome from a Bernoulli distribution whose
# probability of sucess is determined by the function
# probability_fun. The arguments of the function is not yet
# determined which must be accounted for.
}
We are going to simulate a game where there are two teams who play against each other. The game consists of a number of rounds. If a team wins one round then the team gets one point. The win conditions (e.g. when one team wins against the other) are
16, 19, 22, 25, ...
,To clarify, the first team who gets 16 points wins, given that the other team has less than 15 points. If the score is 15-15 then the teams will continue playing. Given that the score is 15-15, the first team to get 19 points wins, given that the other team has less than 18. If the score is 18-18 then the teams will continue and so the match continues.
Implement the following function which determines if a game is finished or not
win_condition_achieved <- function(scoreA, scoreB) {
# Function which determines if a win condition is achieved.
# It takes the score of player A and B and returns a bool
# TRUE/FALSE if the game has ended, e.g. someone has won.
}
Our interest is the team-specific conditional distribution(s) of scores given a team wins. For us to determine these quantities we need a little background of Monte Carlo integration. If you feel like you know how Monte carlo integration works, feel free to skip it.
A very brief introduction to Monte Carlo integration
If the stochastic variable \(X\) is
follows a standard uniform distribution, then \(E(f(X)) =\int_a ^b f(x)dx\). This can be
used in order to approximate integrals since \[
E(f(X)) \approx \frac{1}{N} \sum_i^N f(x_i)
\] That is, we simulate a large number \(x_1,…,x_N\) and evaluate the function \(f\) accordingly. By the Law of Large
Numbers, \(\sum_{i=1}^N f(x_i)/N\) will
converge to the expectation and can thus be viewed as a numerical
approximation of the integral. Hence, a probability can be evaluated
according to
\[
P(X\leq x) = E(\mathbf{1}(X \leq x)) \approx \frac{1}{N} \sum_i^N
\mathbf{1}(x_i \leq x).
\]
The key takeaway is that to approximate anything involving an integral, we can write it as as an expectation and compute it through simulation using Monte Carlo integration. So even for “very complicated” events we can always approximate their probability by simulation! We are going to use our functions to simulate the game. First thing is to determine how the probability is going to be calculated. Remember that its always best to start simple.
probability_fun <- function(scoreA, scoreB, ...) {
# code
}
For each simulation iteration we want to keep track of the score of
the team A, the score of team B and who won. Store these in the matrix
results
. Use all functions we have described above to
simulate a number of repetitions (simulations) of the game. While no one
has won, we need to
stop
, save the
outcomes, otherwise continue with 1.nsims <- 100 # change to a larger number if you want though it might take some time
results <- matrix(nrow=nsims, ncol=3)
for (x in 1:nsims) {
game_won <- FALSE
scoreA <- 0
scoreB <- 0
while (!game_won) {
# code
}
# save outcomes
}
colMeans(results)
Try to answer the following questions:
colMeans(results)
represent?
What are they trying to approximate?probability_fun
return
0.5
or 0.51
regardless of the score? How does
the distribution of number of rounds played look? What happens if
probability_fun
depends on the score (e.g. it might return
(scoreA+0.01) / (scoreA + ScoreB + 0.02)
)? Could you think
of a probbility_fun
that leads to a large expected number
of rounds played?