Generating n samples with rejection sampling in R

Date: 2018-01-04 13:51:08

Tags: r statistics normal-distribution resampling statistical-sampling

Rejection sampling

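I am using rejection sampling for a truncated normal distribution; see the R code below. How can I stop sampling at a specific n, for example 1000 observations? That is, I want to stop sampling once the number of accepted samples reaches n (1000).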

Any suggestions? Any help is much appreciated :)

#Truncated normal curve    
curve(dnorm(x, mean=2, sd=2)/(1-pnorm(1, mean=2, sd=2)),1,9)

#create a data.frame with 100000 random values between 1 and 9

sampled <- data.frame(proposal = runif(100000,1,9))
sampled$targetDensity <- dnorm(sampled$proposal, mean=2, sd=2)/(1-pnorm(1, mean=2, sd=2))

#accept proportional to the targetDensity

maxDens = max(sampled$targetDensity, na.rm = T)
sampled$accepted = ifelse(runif(100000,0,1) < sampled$targetDensity / maxDens, TRUE, FALSE)

hist(sampled$proposal[sampled$accepted], freq = F, col = "grey", breaks = 100, xlim = c(1,9), ylim = c(0,0.35),main="Random draws from skewed normal, truncated at 1")
curve(dnorm(x, mean=2, sd=2)/(1-pnorm(1, mean=2, sd=2)),1,9, add =TRUE, col = "red", xlim = c(1,9),  ylim = c(0,0.35))



X <- sampled$proposal[sampled$accepted]

How can I set the length of X to a specific number when I sample?

1 answer:

Answer 0 (score: 0)

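After sleeping on it: if you are set on using rejection sampling and on running it only until 1000 samples have been accepted, I don't think there is a better option than a while loop. That is much less efficient than the vectorized version: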
# Vectorized accept/reject over all proposals, then keep the first 1000 accepted
sampled$accepted <- runif(nrow(sampled), 0, 1) < sampled$targetDensity / maxDens
X <- head(sampled$proposal[sampled$accepted], 1000)

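The code above takes 0.0624001 s. The code below takes 0.780005 s. I include it because it answers the specific question you asked, but the approach is inefficient. If another option is available, I would use that instead.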

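# Number of samples
N_Target <- 1000
N_Accepted <- 0

# Loop until N_Target proposals are accepted or the proposals run out
i <- 1
sampled$accepted <- FALSE
while (N_Accepted < N_Target) {

    # Accept proposal i with probability targetDensity / maxDens
    sampled$accepted[i] <- runif(1, 0, 1) < sampled$targetDensity[i] / maxDens
    if (sampled$accepted[i]) N_Accepted <- N_Accepted + 1
    i <- i + 1
    if (i > nrow(sampled)) break

}

# Accepted draws (fewer than N_Target if the proposals ran out first)
X <- sampled$proposal[sampled$accepted]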
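
If the loop is there only to guarantee exactly n accepted draws, a possible middle ground is to propose in batches and stop as soon as enough have been accepted. The following is a minimal sketch, not a benchmarked implementation. The names `target_density`, `rejection_sample_n`, and `batch_size` are introduced here for illustration and do not appear in the question. It also assumes the density peaks at the mean (2), which lies inside the proposal range [1, 9], rather than estimating the maximum from the sampled values as the question does.

```r
# Target density from the question: normal(mean = 2, sd = 2) truncated below at 1
target_density <- function(x) {
  dnorm(x, mean = 2, sd = 2) / (1 - pnorm(1, mean = 2, sd = 2))
}

# Hypothetical helper: return exactly n accepted draws, proposing from
# Uniform(1, 9) in batches of batch_size until at least n are accepted.
rejection_sample_n <- function(n, batch_size = 5000) {
  # Assumption: the maximum of the density on [1, 9] is at the mean, x = 2
  max_dens <- target_density(2)
  accepted <- numeric(0)

  while (length(accepted) < n) {
    proposal <- runif(batch_size, 1, 9)
    # Accept each proposal with probability target_density / max_dens
    keep <- runif(batch_size) < target_density(proposal) / max_dens
    accepted <- c(accepted, proposal[keep])
  }

  # Drop any surplus from the last batch so the result has length n
  accepted[seq_len(n)]
}

set.seed(1)
X <- rejection_sample_n(1000)
length(X)  # 1000
```

Each accept/reject decision is independent, so truncating the accepted draws to the first n does not change their distribution. A larger `batch_size` means fewer loop iterations at the cost of more discarded proposals.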