Question

In R, what is the best way to simulate an arbitrary univariate random variable when only its probability density function is available?

Answer 1

Here is a (slow) implementation of the inverse-CDF method for the case where only the density is given.

den<-dnorm #replace with your own density

#calculates the cdf by numerical integration
cdf<-function(x) integrate(den,-Inf,x)[[1]]

#inverts the cdf
inverse.cdf<-function(x,cdf,starting.value=0){
 lower.found<-FALSE
 lower<-starting.value
 while(!lower.found){
  if(cdf(lower)>=(x-.000001))
   lower<-lower-(lower-starting.value)^2-1
  else
   lower.found<-TRUE
 }
 upper.found<-FALSE
 upper<-starting.value
 while(!upper.found){
  if(cdf(upper)<=(x+.000001))
   upper<-upper+(upper-starting.value)^2+1
  else
   upper.found<-TRUE
 }
 uniroot(function(y) cdf(y)-x,c(lower,upper))$root
}

#generates 1000 random variables of distribution 'den'
vars<-apply(matrix(runif(1000)),1,function(x) inverse.cdf(x,cdf))
hist(vars)
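
If speed matters, one common shortcut (not part of the code above) is to tabulate the CDF once on a grid and invert it by linear interpolation. The following is a minimal sketch, assuming the density is negligible outside a finite range [lo, hi] that you pick yourself. The name make_sampler and its arguments are hypothetical, and the accuracy depends on the grid resolution.

```r
# Hypothetical helper: builds an approximate sampler from a density by
# tabulating the CDF on a grid (trapezoid rule) and interpolating its inverse.
make_sampler <- function(den, lo, hi, n.grid = 10001) {
  x <- seq(lo, hi, length.out = n.grid)
  d <- den(x)
  # cumulative trapezoid rule gives the (unnormalised) CDF on the grid
  cdf <- c(0, cumsum((d[-1] + d[-n.grid]) / 2 * diff(x)))
  cdf <- cdf / cdf[n.grid]  # normalise so the CDF ends at exactly 1
  # ties = mean averages x over grid points that share the same CDF value
  inv <- approxfun(cdf, x, ties = mean)
  function(n) inv(runif(n))
}

set.seed(1)
rden <- make_sampler(dnorm, lo = -8, hi = 8)  # the range is an assumption
vars <- rden(1000)
hist(vars)
```

The integration is done once, so drawing further samples only costs a call to runif() and an interpolation.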

Answer 2

To clarify the "use Metropolis-Hastings" answer above:

Suppose ddist() is your probability density function.

Something like:

n <- 10000
cand.sd <- 0.1
init <- 0
vals <- numeric(n)
vals[1] <- init 
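oldprob <- ddist(init)  # density at the current state; needs ddist(init) > 0
for (i in 2:n) {
    newval <- rnorm(1,mean=vals[i-1],sd=cand.sd)
    newprob <- ddist(newval)
    if (runif(1)<newprob/oldprob) {
        vals[i] <- newval
        oldprob <- newprob  # update only when the candidate is accepted
    } else vals[i] <- vals[i-1]
}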

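Notes:

completely untested
efficiency depends on the candidate distribution (i.e. the value of cand.sd). For maximum efficiency, tune cand.sd to give an acceptance rate of 25-40%
results will be autocorrelated ... (although I guess you could always sample() the results to scramble them, or thin)
may need to discard a "burn-in" period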
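
The tuning and post-processing notes above can be sketched as follows. This is only an illustration: it wraps a random-walk Metropolis loop in a hypothetical function rw.metropolis(), uses dnorm as a stand-in for ddist, and the burn-in length and thinning interval are arbitrary choices rather than recommendations.

```r
ddist <- dnorm  # stand-in target density (assumption)

# Hypothetical wrapper around a random-walk Metropolis loop;
# returns the chain together with the observed acceptance rate.
rw.metropolis <- function(n, ddist, init = 0, cand.sd = 1) {
  vals <- numeric(n)
  vals[1] <- init
  oldprob <- ddist(init)
  accepted <- 0
  for (i in 2:n) {
    newval <- rnorm(1, mean = vals[i - 1], sd = cand.sd)
    newprob <- ddist(newval)
    if (runif(1) < newprob / oldprob) {
      vals[i] <- newval
      oldprob <- newprob
      accepted <- accepted + 1
    } else vals[i] <- vals[i - 1]
  }
  list(vals = vals, accept.rate = accepted / (n - 1))
}

set.seed(1)
# try a few candidate standard deviations and compare acceptance rates
for (s in c(0.1, 1, 2.5, 10)) {
  cat("cand.sd =", s, " acceptance rate =",
      rw.metropolis(5000, ddist, cand.sd = s)$accept.rate, "\n")
}

chain <- rw.metropolis(20000, ddist, cand.sd = 2.5)
kept <- chain$vals[-(1:1000)]                   # discard burn-in (length is arbitrary)
thinned <- kept[seq(1, length(kept), by = 10)]  # keep every 10th draw
acf(thinned, plot = FALSE)$acf[2]               # lag-1 autocorrelation after thinning
```

Very small values of cand.sd accept almost every move but explore slowly, while very large values reject most moves, so the acceptance rate is a convenient quantity to watch while tuning.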

The classical approach to this problem is rejection sampling (see e.g. Press et al., Numerical Recipes).
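
A minimal rejection-sampling sketch, under stated assumptions: the target density is bounded above by a known constant M on a finite interval [lo, hi] and is negligible outside it. The function name rejection.sample and the uniform envelope are illustrative choices, not taken from Numerical Recipes.

```r
# Hypothetical helper: rejection sampling with a uniform envelope on [lo, hi].
# Requires den(x) <= M for all x in [lo, hi].
rejection.sample <- function(n, den, lo, hi, M) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- runif(n, lo, hi)         # candidates from the envelope
    u <- runif(n, 0, M)           # uniform heights inside the bounding box
    out <- c(out, x[u < den(x)])  # keep points that fall under the density curve
  }
  out[1:n]
}

set.seed(1)
# dnorm peaks at dnorm(0), roughly 0.399, so M = 0.4 is a valid bound
vars <- rejection.sample(1000, dnorm, lo = -5, hi = 5, M = 0.4)
hist(vars)
```

Unlike Metropolis-Hastings, the accepted draws are independent, but the method wastes many candidates when the bounding box is much larger than the area under the density.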

Answer 3

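Use the cumulative distribution function: http://en.wikipedia.org/wiki/Cumulative_distribution_function

Then just use its inverse. See here for a better picture: http://en.wikipedia.org/wiki/Normal_distribution

In other words: draw a random number u from [0,1], set CDF(x) = u, and solve for x.

The inverse of the CDF is also called the quantile function.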
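
For distributions that ship with R, the quantile function is already available as the q* counterpart of the density (for example qnorm for dnorm, qexp for dexp), so the recipe reduces to one line. A small illustration using the standard normal as an assumed example:

```r
set.seed(1)
u <- runif(1000)  # uniform draws on [0, 1]
x <- qnorm(u)     # push them through the quantile function (inverse CDF)
hist(x)

# round trip: applying the CDF to the draws should recover the uniforms
all.equal(pnorm(x), u)
```

When no closed-form quantile function exists, the inverse has to be computed numerically from the CDF.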

Answer 4

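This is really a comment, but I don't have enough reputation to comment on Ben Bolker's answer.

I'm new to Metropolis, but IMHO this code is wrong because:

a) newval is drawn from a normal distribution, whereas in other codes newval is drawn from a uniform distribution; this value must be drawn from the range covered by the random numbers. For example, for a Gaussian distribution it should be something like runif(1, -5, +5).

b) the probability value must be updated only on acceptance.

Hope this helps, and I hope someone with reputation can correct this answer (especially mine, if I am wrong).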

# the distribution
ddist <- dnorm
# number of random numbers
n <- 100000
# the center of the range is taken as init
init <- 0
# the following should go into a function
vals <- numeric(n)
vals[1] <- init
oldprob <- 0
for (i in 2:n) {
    newval <- runif(1, -5, +5)
    newprob <- ddist(newval)
    if (runif(1) < newprob/oldprob) {
        vals[i] <- newval
        oldprob <- newprob
    } else vals[i] <- vals[i-1]
}
# Final view
hist(vals, breaks = 100)
# and comparison
hist(rnorm(length(vals)), breaks = 100)
