I want to simulate data in R from the following model: Y ~ N(b0 + b1*X, sigma)
and then fit the following model in R: lm(Y ~ 1 + X, data)
Here is roughly the R code:
nsims = 1000
X = 1:50
b0 = rnorm(nsims, 55.63, 31.40)
b1 = rnorm(nsims, 1.04, .39)
sigma = rnorm(nsims, 11.34, 4.11)
The problem is that I want b0, b1, and sigma to be correlated. I want them to have the following correlation structure:
R <- matrix(c(1, .16, .54,
.16, 1, .13,
.54, .13, 1),
nrow = 3)
colnames(R) <- c("b0", "b1", "sigma")
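Now that I want this correlation structure, the rnorm code above is wrong, because it draws the three parameters independently. If my data did not need this correlation matrix, I would probably do the following: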
sim_data <- data.frame()
for(i in 1:nsims){
Y = b0[i] + b1[i]*X + rnorm(length(X), 0, sigma[i])
data_tmp <- data.frame(Y = Y, X = X, ID = i)
sim_data <- rbind(sim_data, data_tmp)
}
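But because of the way I generate the parameters, this ignores my correlation structure. Can anyone give me some advice or pointers on how to incorporate the correlation?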
Answer 0 (score: 2)
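Simulate from a three-dimensional normal distribution and take the variables from it. You can use the MASS package for the multivariate simulation (its mvrnorm function), and the MBESS package to convert the desired correlation matrix into the covariance matrix that mvrnorm requires.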
library(MASS)
library(MBESS)
R <- matrix(c(1, .16, .54,
.16, 1, .13,
.54, .13, 1),
nrow = 3)
SD <- c(31.40, .39, 4.11)
## convert correlation matrix to covariance matrix
Cov <- cor2cov(R, SD)
### you can also do it algebraically without MBESS package
### Cov <- SD %*% t(SD) * R
### where %*% is matrix multiplication and * is normal multiplication
### t() is transpose function
# simulate multivariate normal distribution
mvnorm <- mvrnorm(
1000,
mu = c(55.63, 1.04, 11.34),
Sigma = Cov,
empirical = TRUE
)
# check whether correlation matrix is right
cor(mvnorm)
[,1] [,2] [,3]
[1,] 1.00 0.16 0.54
[2,] 0.16 1.00 0.13
[3,] 0.54 0.13 1.00
# extract variables
b0 <- mvnorm[, 1]
b1 <- mvnorm[, 2]
sigma <- mvnorm[, 3]
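To connect this back to the loop in the question, here is a minimal sketch of how the correlated draws could feed the data simulation and the lm fits. It is an illustration under stated assumptions, not the only way to do it: the seed is arbitrary, the names pars, sim_list, fits and est are made up for this example, and the covariance is built with diag(SD) %*% R %*% diag(SD) so that MBESS is not needed. One more assumption worth flagging: a normal draw for sigma can come out negative, which rnorm cannot use as a standard deviation, so the sketch simply drops those rows (this slightly changes the sample correlations; a log-normal or truncated distribution for sigma would be another option).

```r
library(MASS)  # provides mvrnorm()

set.seed(1)    # arbitrary seed, only for reproducibility
nsims <- 1000
X <- 1:50

R <- matrix(c(1, .16, .54,
              .16, 1, .13,
              .54, .13, 1),
            nrow = 3)
SD <- c(31.40, .39, 4.11)
Cov <- diag(SD) %*% R %*% diag(SD)  # correlation matrix -> covariance matrix

# One row per simulation: correlated draws of (b0, b1, sigma)
pars <- mvrnorm(nsims, mu = c(55.63, 1.04, 11.34), Sigma = Cov, empirical = TRUE)
colnames(pars) <- c("b0", "b1", "sigma")

# Assumption: rows with a non-positive sigma are dropped, because
# rnorm() needs a positive standard deviation
pars <- pars[pars[, "sigma"] > 0, , drop = FALSE]

# Build one data set per parameter draw, then bind them once at the end
sim_list <- lapply(seq_len(nrow(pars)), function(i) {
  Y <- pars[i, "b0"] + pars[i, "b1"] * X + rnorm(length(X), 0, pars[i, "sigma"])
  data.frame(Y = Y, X = X, ID = i)
})
sim_data <- do.call(rbind, sim_list)

# Fit the model from the question to each simulated data set
fits <- lapply(split(sim_data, sim_data$ID),
               function(d) coef(lm(Y ~ 1 + X, data = d)))
est <- do.call(rbind, fits)  # one row of (intercept, slope) per data set
head(est)
```

Collecting the per-draw data frames in a list and calling rbind once avoids growing sim_data inside the loop, which gets slow as nsims increases.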