How can I simulate correlated binary data in R?

Asked: 2013-04-18 17:09:43

Tags: r binary simulation correlation

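Suppose I want two vectors of binary data with a specified phi coefficient. How can I simulate them in R?

For example, how can I create two vectors x and y of a specified length whose phi coefficient is 0.79, like these: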

> x = c(1,  1,  0,  0,  1,  0,  1,  1,  1)
> y = c(1,  1,  0,  0,  0,  0,  1,  1,  1)
> cor(x,y)
[1] 0.7905694
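
For two 0/1 vectors, the phi coefficient is the same number as the Pearson correlation returned by cor(), which is why cor(x, y) is used as the target above. A minimal sketch that checks this from the 2x2 contingency table (the variable names tab, n11, n00, n10, n01 and phi_value are only illustrative):

```r
x <- c(1, 1, 0, 0, 1, 0, 1, 1, 1)
y <- c(1, 1, 0, 0, 0, 0, 1, 1, 1)

# 2x2 contingency table: rows are values of x, columns are values of y
tab <- table(x, y)
n11 <- tab["1", "1"]; n00 <- tab["0", "0"]
n10 <- tab["1", "0"]; n01 <- tab["0", "1"]

# phi = (n11*n00 - n10*n01) / sqrt(product of the four marginal totals)
phi <- (n11 * n00 - n10 * n01) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

phi_value <- unname(phi)
all.equal(phi_value, cor(x, y))
```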

1 answer:

Answer 0 (score: 11)

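The bindata package is a good option for generating binary data with this and more complex correlation structures. (Here's a link to a working paper (warning, pdf) that lays out the theory underlying the approach taken by the package authors.)

In your case, assuming that the marginal probabilities of x and y are both 0.5: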

library(bindata)

## Construct a binary correlation matrix
rho <- 0.7905694
m <- matrix(c(1,rho,rho,1), ncol=2)   

## Simulate 1e5 (100,000) x-y pairs, and check that they have the
## specified correlation structure
x <- rmvbin(1e5, margprob = c(0.5, 0.5), bincorr = m) 
cor(x)
#           [,1]      [,2]
# [1,] 1.0000000 0.7889613
# [2,] 0.7889613 1.0000000
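
If installing a package is not an option, the two-variable case can also be sketched in base R. This is not how bindata works internally; it is a simpler alternative that relies on the fact that, for marginal probabilities p and q, a phi coefficient rho fixes the joint probability P(x = 1, y = 1) = p*q + rho*sqrt(p(1-p)q(1-q)). The helper name sim_binary_pair and its arguments are hypothetical:

```r
# Hypothetical helper (not part of bindata): draw n correlated binary pairs
# with marginal probabilities p and q and target phi coefficient rho.
sim_binary_pair <- function(n, rho, p = 0.5, q = 0.5) {
  # Joint probability that both are 1, from cov(x, y) = rho * sd(x) * sd(y)
  p11 <- p * q + rho * sqrt(p * (1 - p) * q * (1 - q))
  # Cell probabilities in the order (1,1), (1,0), (0,1), (0,0)
  probs <- c(p11, p - p11, q - p11, 1 - p - q + p11)
  if (any(probs < 0)) stop("rho is not attainable for these marginals")
  # Sample one of the four cells per pair, then map cells back to (x, y)
  cell <- sample(4, n, replace = TRUE, prob = probs)
  cbind(x = as.integer(cell %in% c(1, 2)),
        y = as.integer(cell %in% c(1, 3)))
}

set.seed(1)
xy <- sim_binary_pair(1e5, rho = 0.7905694)
cor(xy)[1, 2]
```

As with rmvbin, the sample correlation only approaches rho as n grows; for short vectors it can differ noticeably from the target.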