Bayesian statistics: simulate the above experiment 10,000 times in R, recording the length of the longest run each time

Time: 2017-09-05 04:03:17

Tags: r bayesian naivebayes

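I am trying to find the average length of the longest run across 10,000 simulations of flipping a coin 30 times. I need to simulate the experiment described above 10,000 times in R, recording the length of the longest run each time.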

Here is my code so far:

coin <- sample(c("H", "T"), 10000, replace = TRUE)
table(coin) 
head(coin, n = 30)
rle(c("H", "T", "T", "H", "H", "H", "H", "H", "T", "H"))
coin.rle <- rle(coin)
str(coin.rle)

How do I find the average of the longest run across the 10,000 simulations?

2 answers:

Answer 0: (score: 2)

I think the following is what you are after.

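n_runs <- 10000
max_runs <- numeric(n_runs)  # longest run found in each simulation
for (j in seq_len(n_runs)) {
  # one experiment: 30 fair coin flips
  coin <- sample(c("H", "T"), 30, replace = TRUE)
  # rle() returns the length of every run; keep the longest
  max_runs[j] <- max(rle(coin)$lengths)
}
mean(max_runs)  # average longest run over all simulations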

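To understand the code, it helps to inspect a small portion of coin (for example coin[1:20]) alongside its run-length encoding (rle(coin[1:20])). rle computes the length of each run, so max(rle(coin)$lengths) gives the longest run.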
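To make this concrete, here is a minimal sketch that applies rle to the fixed ten-flip vector from the question, so the result is the same on every run. The variable names flips, runs, and longest are only illustrative.

```r
# The fixed ten-flip sequence from the question (no randomness involved)
flips <- c("H", "T", "T", "H", "H", "H", "H", "H", "T", "H")

runs <- rle(flips)
runs$values    # the symbol of each run:  "H" "T" "H" "T" "H"
runs$lengths   # the length of each run:   1   2   5   1   1

# The longest run is the largest entry in lengths
longest <- max(runs$lengths)
longest        # 5 (the five consecutive heads)
```

Note that the component is called lengths; writing $length only works because $ does partial name matching on lists.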

Edit: the following may be faster.

len <- 30
times <- 10000

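flips <- sample(c("H", "T"), len * times, replace = TRUE)
# split the flips into consecutive blocks of len, one block per simulation
runs <- sapply(split(flips, ceiling(seq_along(flips) / len)),
               function(x) max(rle(x)$lengths))
mean(runs)              # average of the longest runs
sum(runs >= 7) / times  # proportion of simulations with a longest run >= 7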

Answer 1: (score: 1)

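All coin flips are independent of one another (that is, the outcome of one flip does not affect any other). We can therefore flip all the coins for all the simulations at once, then arrange them so that each 30-flip trial is easy to summarize. Here is how I would do it.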

# do all of the flips at once, this is okay because each flip
# is independent
coin_flips <- sample(c("heads", "tails"), 30 * 10000, replace = TRUE)

# put them into a 10000 by 30 matrix, each row
# indicates one 'simulation'
coin_matrix <- matrix(coin_flips, ncol = 30, nrow = 10000)

# we now want to iterate through each row using apply,
# to do so we need to make a function to apply to each
# row. This gets us the longest run over a single
# simulation
get_long_run <- function(x) {
  max(rle(x)$lengths)
}

# apply this function to each row
longest_runs <- apply(coin_matrix, 1, get_long_run)

# get the number of simulations that had a max run >= 7. Divide this
# by the number of simulations to get the probability of this occurring.
sum(longest_runs >= 7) / nrow(coin_matrix)

You should get something between 18% and 19%, though the exact figure will vary each time you run the simulation.
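As a cross-check on the simulated figure, the probability can also be computed exactly by counting sequences. The sketch below is not part of either answer: it assumes a fair coin and counts runs of either heads or tails, and the function name exact_prob_long_run with its arguments n_flips and threshold is hypothetical. It tracks how many sequences end in a run of each allowed length while never reaching the threshold.

```r
# Exact probability that the longest run (heads or tails) in n_flips
# fair coin flips is at least `threshold`. Illustrative helper only.
exact_prob_long_run <- function(n_flips = 30, threshold = 7) {
  max_ok <- threshold - 1      # longest run length still allowed
  # counts[k] = number of sequences so far whose current run has
  # length k and in which no run has reached `threshold`
  counts <- numeric(max_ok)
  counts[1] <- 2               # first flip: "H" or "T", run length 1
  for (i in seq_len(n_flips - 1)) {
    # a different symbol resets the run to length 1 (from any state);
    # the same symbol extends a run of length k to k + 1, and runs
    # that would exceed max_ok are dropped
    counts <- c(sum(counts), counts[-max_ok])
  }
  1 - sum(counts) / 2^n_flips
}

exact_prob_long_run(30, 7)
```

Under these assumptions the exact value comes out at roughly 18.5%, which agrees with the simulated range. A quick sanity check on a case small enough to enumerate by hand: exact_prob_long_run(3, 3) is 0.25, since only HHH and TTT out of the 8 possible sequences contain a run of three.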