Bagging AIC values from mixed-effects linear regression models in R

Time: 2017-04-18 01:22:33

Tags: r matrix regression bootstrapping mixed-models

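I have resampled a data set with replacement 1000 times, and now want to fit three models to each of these 1000 data sets and bag their AIC scores. The end goal is to obtain the mean AIC score of each model across all data sets, along with their 95% confidence intervals. The code below is faulty and I can't tell where I've gone wrong. What happens is that the final matrix only contains the vectors of AIC scores from the first few iterations (i.e., not all 1000). Is there an error in the way the main matrix or the vector is initialized at each iteration? Or is my row-adding procedure flawed? Alternatively, if the code is correct, could the problem be the data sets being fed into it? If the latter is the case, why am I not getting errors when the code reads in those data sets and skips them? I've been stuck on this for days and I'm stumped, so any help would be appreciated.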

require(lme4)
require(lmerTest)

# initializing an empty matrix for storing each vector of AIC scores from each iteration
# the matrix has width 3 because three models are fitted at each iteration
AIC.scores = data.frame(matrix(, nrow = 0, ncol = 3))

#fit regression models to each of 1000 datasets
for(iter in 1:1000){

  #retrieving the data set, named accordingly, for the current iteration
  data = read.csv(paste("data_set_", iter,".csv", sep=""), header=TRUE)

  #initializing vector of AICs from models in current iteration
  AIC.score = vector(mode="numeric", length=3)

  mod1 = lmer(RT.log ~ crit.var1.log.std +
                       (1|Subject) +
                       (1|Item),
                          data = data,
                          REML=FALSE)

  AIC.score[1] = summary(mod1)$AIC[1]

  mod2 = lmer(RT.log ~ crit.var2.log.std +
                       (1|Subject) +
                       (1|Item),
                          data = data,
                          REML=FALSE)

  AIC.score[2] = summary(mod2)$AIC[1]

  mod3 = lmer(RT.log ~ crit.var3.log.std +
                       (1|Subject) +
                       (1|Item),
                          data = data,
                          REML=FALSE)

  AIC.score[3] = summary(mod3)$AIC[1]

  #adding vector of AICs scores from current iteration to main matrix
  AIC.scores = rbind(AIC.scores, t(AIC.score))

  cat("bagging iteration", iter, "completed!\n")

 }

#renaming column names in AIC score matrix
colnames(AIC.scores) = c("model1", "model2", "model3")

# function for calculating mean AIC and 95% C.I.s for each model across all  iterations
norm.interval = function(data, z=1.96) {
  mean = mean(data)
  variance = var(data)
  sd = sqrt(variance/length(data))
  c(mean, mean - z * sd, mean + z * sd)
}

for (i in 1:3) {

  cat("The mean, lCI, uCI for model", i, "are:", norm.interval(AIC.scores[,i]), "\n")

}

1 answer:

Answer 0 (score: 0)

This is a shot in the dark, since I don't know what your models or your data look like.

Read all of your data frames into a single list:

all.data <- lapply(paste0("data_set_", 1:1000, ".csv"), read.csv, header=TRUE)

Based on that list of data, generate the lmer calls as text:

all.form <- lapply(paste0("all.data[[", 1:1000, "]]"), function(x) list(
  mod1 = paste0("lmer(RT.log ~ crit.var1.log.std +  (1|Subject) + (1|Item), REML=FALSE, data =", x, ")"),
  mod2 = paste0("lmer(RT.log ~ crit.var2.log.std +  (1|Subject) + (1|Item), REML=FALSE, data =", x, ")"),
  mod3 = paste0("lmer(RT.log ~ crit.var3.log.std +  (1|Subject) + (1|Item), REML=FALSE, data =", x, ")")
  ))

Evaluate the lmer calls and store the fitted models in a single list:

all.lmer.mod <- lapply(all.form, function(x) lapply(x, function(y) eval(parse(text=y))))

Extract the AIC values of all the lmer models:

all.AIC <- lapply(all.lmer.mod, function(x) lapply(x, AIC))

Note that if your data frames are large and Subject and Item have many levels, this process will take a long time. Change 1:1000 to 1:2 first and test on that before running the full set.
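
The result is a nested list, so it still has to be collapsed before the per-model means and intervals from the question can be computed. Here is a minimal sketch of one way to do that. It is an assumption, not part of the original answer: the AIC values are simulated stand-ins so the snippet runs without lme4 or the CSV files, and it uses a percentile interval across resamples rather than the normal approximation in the question's norm.interval. The names AIC.mat and AIC.summary are hypothetical.

```r
# Simulated stand-in for all.AIC: one element per resampled data set,
# each holding three AIC values (made-up numbers, for illustration only).
set.seed(1)
all.AIC <- lapply(1:1000, function(i) list(
  mod1 = rnorm(1, mean = 510, sd = 5),
  mod2 = rnorm(1, mean = 500, sd = 5),
  mod3 = rnorm(1, mean = 505, sd = 5)
))

# Collapse the nested list into a matrix:
# one row per data set, one column per model.
AIC.mat <- do.call(rbind, lapply(all.AIC, unlist))

# Sanity check that no iteration was silently dropped.
dim(AIC.mat)

# Mean AIC per model, plus a 95% percentile interval across resamples.
AIC.summary <- rbind(
  mean = colMeans(AIC.mat),
  apply(AIC.mat, 2, quantile, probs = c(0.025, 0.975))
)
print(AIC.summary)
```

Checking dim(AIC.mat) against the expected 1000 rows is a quick way to confirm that every data set contributed a row, which is the symptom described in the question.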