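I am trying to find the average length of the longest run when flipping a coin 30 times, estimated over 10,000 simulations. I need to run the experiment described above 10,000 times in R, recording the length of the longest run each time.
Here is my code so far: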
coin <- sample(c("H", "T"), 10000, replace = TRUE)
table(coin)
head(coin, n = 30)
rle(c("H", "T", "T", "H", "H", "H", "H", "H", "T", "H"))
coin.rle <- rle(coin)
str(coin.rle)
Answer 0 (score: 2)
I think the following is what you are after.
n_runs <- 10000
max_runs <- numeric(n_runs)
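for (j in seq_len(n_runs)) {
  # one experiment: 30 independent flips
  coin <- sample(c("H", "T"), 30, replace = TRUE)
  # longest run of identical outcomes in this experiment
  max_runs[j] <- max(rle(coin)$lengths)
}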
mean(max_runs)
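To understand the code, it helps to inspect a small portion of coin (for example coin[1:20]) together with its run-length encoding (rle(coin[1:20])). rle computes the length of each run of identical values, so taking the maximum of those lengths gives the longest run.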
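To make the role of rle concrete, here is a small sketch that reuses the ten-flip vector from the question. The names flips, r, and longest are chosen only for illustration.

```r
# Ten flips taken from the example vector in the question
flips <- c("H", "T", "T", "H", "H", "H", "H", "H", "T", "H")

# rle() collapses consecutive equal values into (length, value) pairs
r <- rle(flips)
r$lengths   # 1 2 5 1 1
r$values    # "H" "T" "H" "T" "H"

# The longest run is the largest entry of the lengths component
longest <- max(r$lengths)   # 5
```

The five consecutive "H" values in the middle of the vector form the longest run, which is what max picks out.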
Edit: the following may be faster.
len <- 30
times <- 10000
flips <- sample(c("H", "T"), len * times, replace = TRUE)
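# label each block of len consecutive flips, split on that label,
# then take the longest run within each block
runs <- sapply(split(flips, ceiling(seq_along(flips) / len)),
               function(x) max(rle(x)$lengths))
mean(runs)               # average of the longest runs
sum(runs >= 7) / times   # proportion of simulations whose longest run is >= 7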
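The grouping step in this version can be hard to read at first. The sketch below uses a deliberately tiny block length of 3 and six made-up flips (both are illustrative assumptions, not part of the original answer) to show how ceiling(seq_along(x) / len) labels consecutive blocks before split breaks the vector apart.

```r
len <- 3
x <- c("H", "H", "T", "T", "T", "T")   # made-up flips for illustration

# positions 1..3 get label 1, positions 4..6 get label 2
groups <- ceiling(seq_along(x) / len)   # 1 1 1 2 2 2

# split() returns one character vector per label
blocks <- split(x, groups)

# longest run within each block
block_max <- vapply(blocks, function(b) max(rle(b)$lengths), integer(1))
block_max   # block "1" -> 2, block "2" -> 3
```

Note that runs are counted within each block only: the trailing "T" of the first block and the three "T" values of the second block are not merged, which is the desired behavior when each block is a separate experiment.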
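Answer 1 (score: 1)
All coin flips are independent of one another (that is, the outcome of one flip does not affect any other). We can therefore perform every flip for every simulation at once, and then arrange the results so that each 30-flip trial is easy to summarize. Here is how I would do it.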
# do all of the flips at once, this is okay because each flip
# is independent
coin_flips <- sample(c("heads", "tails"), 30 * 10000, replace = TRUE)
# put them into a 10000 by 30 matrix, each row
# indicates one 'simulation'
coin_matrix <- matrix(coin_flips, ncol = 30, nrow = 10000)
# we now want to iterate through each row using apply,
# to do so we need to make a function to apply to each
# row. This gets us the longest run over a single
# simulation
get_long_run <- function(x) {
  max(rle(x)$lengths)
}
# apply this function to each row
longest_runs <- apply(coin_matrix, 1, get_long_run)
# get the number of simulations that had a max run >= 7. Divide this
# by the number of simulations to estimate the probability of this occurring.
sum(longest_runs >= 7)/nrow(coin_matrix)
You should get a value somewhere between 18% and 19%, but it will vary each time you run the simulation.
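If the run-to-run variation is a concern, one option is to fix the random seed and report a standard error alongside the estimate. The following is a minimal sketch under stated assumptions: the seed value 42 is arbitrary, the names n_sims, n_flips, p_hat, and se are illustrative, and replicate is used here in place of the matrix approach above purely for brevity.

```r
set.seed(42)   # arbitrary seed, chosen only so the run is repeatable

n_sims  <- 10000
n_flips <- 30

# replicate() repeats one 30-flip experiment n_sims times
longest_runs <- replicate(n_sims, {
  flips <- sample(c("H", "T"), n_flips, replace = TRUE)
  max(rle(flips)$lengths)
})

# estimated probability that the longest run is at least 7
p_hat <- mean(longest_runs >= 7)

# binomial standard error of that estimate
se <- sqrt(p_hat * (1 - p_hat) / n_sims)

c(estimate = p_hat, std_error = se)
```

With 10,000 simulations the standard error is a fraction of a percentage point, which is consistent with the estimate shifting slightly between runs.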