Question

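I am a beginner in R and I am working on a simulation study. I managed to generate one sample the way I wanted. However, I don't know how to replicate what I did.

Here is the program I have written so far: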

I <- 500       # number of observations
J <- 18        # total number of items
K <- 6         # number of testlets
JK <- 3        # number of items within a testlet
response <- matrix(0, I, J)  # null binary (0, 1) response matrix 
unit <- matrix(1, JK, 1)     # unit vector

set.seed(1234)

# Multidimensional 3-pl model
pij <- function(a,b,c,theta,gamma) {c+(1-c)*(1/(1+exp(-1.7*a*(theta-b-gamma))))}

# Assigning a and b parameter values
a <- c(.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7)
b <- c(1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5)
# Assigning c-parameter, each 3 items (c-parameter & testlet effect)
#(small&small, small&large, large&small, large&large, mixed&small, mixed&large)
c <- c(.2,.2,.2,.2,.2,.2,.5,.5,.5,.5,.5,.5,.2,.33,.5,.2,.33,.5)    

theta <- rnorm(I, 0, 1)   # random sampling theta-values from normal dist. M=0, SD=1

gamma1 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma2 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma3 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma4 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma5 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma6 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1

# implementing that the testlet effect is same for the items within a testlet
gamma1T <- gamma1 %*% t(unit)
gamma2T <- gamma2 %*% t(unit)
gamma3T <- gamma3 %*% t(unit)
gamma4T <- gamma4 %*% t(unit)
gamma5T <- gamma5 %*% t(unit)
gamma6T <- gamma6 %*% t(unit)

gammaT <- matrix(c(gamma1T, gamma2T, gamma3T, gamma4T, gamma5T, gamma6T), I, J)  # getting all the gammas together in a large matrix

# Generating data using the information above
for(i in 1:I) {
  for(j in 1:J) {
    response[i, j] <- ifelse(pij(a=a[j], b=b[j], c=c[j], theta=theta[i], gamma=gammaT[i,j]) < runif(1), 0, 1)
  }
}

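So I get one data set, "response". What I want to do is replicate this and obtain, say, 1000 "response" data sets. I think this can be done by repeating the random sampling of "theta" and "gamma", but I don't know how to actually do it.

Many thanks in advance,

Hanjoe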

Answer 1

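Stedy's suggestion is sound, except for one thing: do not increment the seed inside the for loop.

As I understand Stedy's suggestion, set.seed(i) would be called inside the for loop for each simulation, with i incremented on every iteration. This strategy is known to introduce (possibly large) bias, because the sequences generated from successive seeds can be correlated.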

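Instead, set the seed once at the start, i.e. before the for loop. For example, you could use the current Unix time as the seed, or read one from a file of random numbers (e.g. from random.org). Also make sure to store the seed together with the results, e.g. by printing it to a log file. If you later want to reproduce the exact results of an earlier set of replications, you only need to set the corresponding seed.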
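The advice above could be sketched as follows. This is only a minimal illustration: `simulate_once()` is a hypothetical stand-in for one full run of the data-generating code from the question, and the log file name `seed.log` is likewise an assumption.

```r
# Hypothetical stand-in for one replication of the question's code:
# returns an I x J binary response matrix.
simulate_once <- function(I = 500, J = 18) {
  matrix(rbinom(I * J, size = 1, prob = 0.5), nrow = I, ncol = J)
}

# Choose one seed for the whole study, here from the current Unix time.
seed <- as.integer(Sys.time())

# Store the seed with the results so the run can be reproduced later.
cat("seed:", seed, "\n", file = "seed.log", append = TRUE)

# Seed once, before the loop, and never inside it.
set.seed(seed)

n_rep <- 1000
responses <- vector("list", n_rep)
for (r in seq_len(n_rep)) {
  responses[[r]] <- simulate_once()
}
```

Calling set.seed() again with the logged seed and rerunning the same loop should regenerate the same list of data sets, provided the R version and RNG settings are unchanged.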

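If you want other people to be able to replicate your results exactly, you should also state which version of R you used (and possibly the operating system), because RNG implementations may differ.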
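One possible way to capture this information from within R is sketched below; the list name `run_info` is only illustrative.

```r
# Record the software environment next to the simulation results.
run_info <- list(
  r_version = R.version.string,    # R release in use
  platform  = R.version$platform,  # operating system / architecture
  rng_kind  = RNGkind()            # random number generators currently selected
)
print(run_info)

# sessionInfo() gives a fuller record, including attached packages.
print(sessionInfo())
```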

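Also, simulation replications are an embarrassingly parallel task, i.e. if you have a multicore machine (and e.g. rparallel), you can easily run your replications in parallel. In that case, however, extra care is needed with the random numbers (e.g. see this paper).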
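As a rough sketch of a parallel run, the example below uses the `parallel` package that ships with R rather than the rparallel package mentioned above, which is a substitution on my part. `clusterSetRNGStream()` gives each worker its own L'Ecuyer-CMRG random number stream. `simulate_once()` is again a hypothetical stand-in for one replication, and the worker count of 2 is an assumption.

```r
library(parallel)

# Hypothetical stand-in for one replication; the first argument is the
# replication index supplied by parLapply() and is not otherwise used.
simulate_once <- function(r, I = 500, J = 18) {
  matrix(rbinom(I * J, size = 1, prob = 0.5), nrow = I, ncol = J)
}

cl <- makeCluster(2)                   # two worker processes (assumed)
clusterSetRNGStream(cl, iseed = 1234)  # separate RNG stream for each worker

responses <- parLapply(cl, seq_len(100), simulate_once)

stopCluster(cl)
```

The results depend on the seed and on the number of workers, so both are worth recording with the output.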

Answer 2

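I would use local variables and turn the code into a function. Then create a for() loop that calls the function, and have the loop increment the value passed to set.seed() by one each time the function is called, for the length of the loop.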
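A sketch of the function-plus-loop idea, under some assumptions: the function name `simulate_response()` is made up, the guessing parameters are stored in `g` instead of `c` to avoid masking the base function, and the double loop from the question is replaced by vectorized arithmetic. Because the uniform draws are taken in a different order, the output will not match the question's script draw for draw. The seed is set once before the loop, following the caution in Answer 1, rather than being incremented.

```r
# One replication of the question's testlet 3PL simulation, with all
# intermediate objects local to the function.
simulate_response <- function(I = 500) {
  K  <- 6    # number of testlets
  JK <- 3    # items per testlet
  J  <- K * JK

  a <- rep(c(.8, .9, .7), times = K)
  b <- rep(c(1, 0, -1.5), times = K)
  g <- c(rep(.2, 6), rep(.5, 6), rep(c(.2, .33, .5), 2))  # guessing parameters
  gamma_sd <- rep(c(.2, 1), length.out = K)  # small, large, small, large, ...

  theta <- rnorm(I, 0, 1)

  # One testlet effect per person and testlet (I x K); column k uses gamma_sd[k].
  gamma <- matrix(rnorm(I * K, 0, rep(gamma_sd, each = I)), I, K)
  # Repeat each testlet column for the JK items inside it (I x J).
  gammaT <- gamma[, rep(seq_len(K), each = JK), drop = FALSE]

  a_mat <- matrix(a, I, J, byrow = TRUE)
  b_mat <- matrix(b, I, J, byrow = TRUE)
  g_mat <- matrix(g, I, J, byrow = TRUE)

  # 3PL probability; plogis(x) is 1 / (1 + exp(-x)).
  p <- g_mat + (1 - g_mat) * plogis(1.7 * a_mat * (theta - b_mat - gammaT))

  # Response is 1 when a uniform draw falls below the probability.
  matrix(as.integer(runif(I * J) < p), I, J)
}

set.seed(1234)   # seed once, before the loop
n_rep <- 1000
datasets <- vector("list", n_rep)
for (r in seq_len(n_rep)) {
  datasets[[r]] <- simulate_response()
}
```

Each element of `datasets` is then one 500 x 18 "response" matrix.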

How to replicate my simulation study
