Question

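I want to choose the best random-effects structure for my mixed-effects model (fitted with lmer() from lme4). I found the function stepcAIC() in the package cAIC4, which is meant to compare models and select, step by step, the one with the smallest (conditional) AIC. Although using it looks straightforward, I get errors.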

After fitting the model, I run the following call:

stepcAIC(model_full, direction="backward")

First, it takes forever to run. Second, I get an error message. I tried specifying the dataset explicitly:

stepcAIC(model_full, direction="backward", data=data_correct)

I also tried updating R to the latest version and running it again, but that did not help.

Does anyone have positive experience with this function who can tell me what I am doing wrong?

The error I get is:

Error in eval(predvars, data, env) : object 'Color1' not found

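I have a variable named "Color" but none named "Color1". Maybe "Color1" is the name taken from the effects table, but why would the function take a name from the summary table and look for it in the data frame?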
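For context, a "1" suffix is how R labels a coefficient column when a factor uses contrasts that carry no column names, such as sum-to-zero coding. The sketch below uses a made-up two-level factor and assumes `contr.sum`; whether data_correct is actually coded this way is only a guess based on the names in the summary output.

```r
# Hypothetical toy data; the levels "red"/"blue" are made up for illustration.
toy <- data.frame(Color = factor(c("red", "blue", "red", "blue")))

# Assumption: sum-to-zero contrasts, which have no column names of their own,
# so R numbers the resulting column.
mm_sum <- model.matrix(~ Color, data = toy,
                       contrasts.arg = list(Color = "contr.sum"))
colnames(mm_sum)   # "(Intercept)" "Color1"

# With the default treatment contrasts the level name is appended instead.
mm_trt <- model.matrix(~ Color, data = toy)
colnames(mm_trt)   # "(Intercept)" "Colorred"
```

Under that assumption, "Color1" is a coefficient name rather than a column of the data frame, which would be consistent with the lookup failing when the name is evaluated against the data.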

I also get a warning:

In if (!hasInt(resForThisGroup)) res[[i]] <- res[[i]][-j] : the condition has length > 1 and only the first element will be used

Here is a [link](https://drive.google.com/open?id=1jIJn2rzK3SwpKMfKGDhseYcOxinuwpue) to download data_correct and model_full:

This is how I created model_full:

model_full <- lmer(data=data_correct, log_RT~Polarity+Delay+Truth_value+Type+Color+Order + Polarity:Delay + Polarity:Truth_value + Polarity:Order + Polarity:Type+ Polarity:Color + Delay:Truth_value+ Truth_value:Delay:Polarity + (1+Polarity*Color+Delay+Delay:Polarity+Truth_value|Subject), control=lmerControl(optimizer="bobyqa"), REML=FALSE)

This is the output for model_full:

Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log_RT ~ Polarity + Delay + Truth_value + Type + Color + Order +  
    Polarity:Delay + Polarity:Truth_value + Polarity:Order +  
    Polarity:Type + Polarity:Color + Delay:Truth_value + Truth_value:Delay:Polarity +  
    (1 + Polarity * Color + Delay + Delay:Polarity + Truth_value |          Subject)
   Data: data_correct
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 16556.6  16896.2  -8235.3  16470.6    19838 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9078 -0.6585 -0.1065  0.5654  6.5045 

Random effects:
 Groups   Name             Variance  Std.Dev. Corr                               
 Subject  (Intercept)      0.0652479 0.25544                                     
          Polarity1        0.0045472 0.06743   0.51                              
          Color1           0.0030415 0.05515   0.15  0.13                        
          Delay1           0.0005240 0.02289   0.22 -0.05 -0.02                  
          Truth_value1     0.0022027 0.04693   0.00  0.48  0.23  0.00            
          Polarity1:Color1 0.0003927 0.01982   0.04 -0.33  0.57 -0.50 -0.12      
          Polarity1:Delay1 0.0001981 0.01408   0.61  0.07  0.06  0.55  0.06 -0.04
 Residual                  0.1304137 0.36113                                     
Number of obs: 19881, groups:  Subject, 38

Fixed effects:
                                Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)                    6.572e+00  4.152e-02  3.800e+01 158.301  < 2e-16 ***
Polarity1                      1.234e-01  1.124e-02  3.797e+01  10.985 2.38e-13 ***
Delay1                        -6.476e-02  4.512e-03  3.817e+01 -14.352  < 2e-16 ***
Truth_value1                   5.266e-02  8.034e-03  3.805e+01   6.556 9.83e-08 ***
Type1                          7.531e-03  2.562e-03  1.962e+04   2.939 0.003292 ** 
Color1                         2.512e-02  9.308e-03  3.756e+01   2.698 0.010379 *  
Order1                        -3.524e-02  8.981e-03  3.794e+01  -3.924 0.000354 ***
Polarity1:Delay1              -2.244e-02  3.433e-03  3.834e+01  -6.538 1.00e-07 ***
Polarity1:Truth_value1        -5.728e-02  2.563e-03  1.963e+04 -22.347  < 2e-16 ***
Polarity1:Order1              -1.250e-02  3.547e-03  3.823e+01  -3.525 0.001119 ** 
Polarity1:Type1               -7.107e-03  2.562e-03  1.962e+04  -2.774 0.005544 ** 
Polarity1:Color1               4.012e-03  4.114e-03  3.790e+01   0.975 0.335639    
Delay1:Truth_value1            5.301e-03  2.563e-03  1.963e+04   2.068 0.038629 *  
Polarity1:Delay1:Truth_value1  9.625e-03  2.563e-03  1.963e+04   3.755 0.000174 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Answer 1

(Only sort of an answer; I will delete it later if appropriate.)

I can't replicate your problem because your data set is too big for the machine I'm currently working on; when I try to run stepcAIC(model_full, direction="backward"), I get:

The cAIC of the initial model could not be computed.

(which is explained by the message from cAIC(model_full)):

Error: cannot allocate vector of size 2.9 Gb

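This is perhaps not surprising, since the model is moderately large (about 20K observations, 28 parameters). (Digging into the code, we can see that it tries to construct a dense identity matrix with dimension equal to the number of observations; in this case, n * n * 8 bytes is close to 3 Gb.)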
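The size estimate can be checked with a quick back-of-the-envelope calculation. This sketch only redoes the arithmetic with the observation count from the model summary; it does not call cAIC4.

```r
n <- 19881            # number of observations reported in the model summary
bytes <- n^2 * 8      # dense n x n matrix of doubles, 8 bytes per entry
gib <- bytes / 2^30   # R reports allocation sizes in units of 2^30 bytes
round(gib, 1)         # 2.9
```

That matches the 2.9 Gb in the allocation error.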

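You only really need to compute the cAIC if you want to select models on the basis of individual-level predictions. If you want to select on the basis of population-level predictions, the ordinary AIC should be acceptable (and is much cheaper to compute). The simplest selection procedure is based on p-values (I don't love it, because I don't think modeling decisions should be based on significance tests, but many people use it).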

The step() function in lmerTest does backward selection based on p-values:

system.time(ss <- step(model_full,reduce.fixed=FALSE))

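This takes about 4.5 minutes on my old laptop. The (abbreviated) result is that it tests the effect of removing Truth_value, Polarity:Color, and Polarity:Delay from the random effects and concludes that none of them should be removed.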

Backward reduced random-effect table:

                     Eliminated npar  logLik   AIC     LRT Df Pr(>Chisq)    
<none>                            43 -8235.3 16557                          
T_i(1+P*C+D+D:P+T_|S          0   36 -8366.3 16804 261.915  7  < 2.2e-16 ***
P:Ci(1+P*C+D+D:P+T|S          0   36 -8257.1 16586  43.693  7  2.451e-07 ***
P:Di(1+P*C+D+D:P+T|S          0   36 -8245.0 16562  19.507  7   0.006739 ** 
---

?step.lmerModLmerTest

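... the "Eliminated" column gives the order in which terms were eliminated from the model, with zero ("0") indicating that the term was not eliminated from the model.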

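In this case the step() function tried removing each of the highest-order terms (the two-way interactions plus the main effect of Truth_value, which is not involved in any interaction) and found that it did not want to drop any of them. Here the p-value criterion (p < 0.05 for all terms) and the AIC criterion (every reduced model has a larger AIC than the original model) agree with each other.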
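As a rough cross-check of that agreement, the AIC and p-values can be recomputed from the rounded numbers printed in the table, using AIC = -2 * logLik + 2 * npar and a chi-squared test on the LRT. The data frame below is typed in by hand from the table (the full term names are my expansion of the abbreviated row labels), so the results can differ from the printed ones in the last digit.

```r
# Values copied by hand from the backward-reduction table above.
tab <- data.frame(
  term   = c("<none>", "Truth_value", "Polarity:Color", "Polarity:Delay"),
  npar   = c(43, 36, 36, 36),
  logLik = c(-8235.3, -8366.3, -8257.1, -8245.0),
  LRT    = c(NA, 261.915, 43.693, 19.507),
  Df     = c(NA, 7, 7, 7)
)

# AIC from log-likelihood and parameter count.
tab$AIC <- -2 * tab$logLik + 2 * tab$npar

# Likelihood-ratio test p-values (NA for the full model row).
tab$p <- pchisq(tab$LRT, df = tab$Df, lower.tail = FALSE)

all(tab$AIC[-1] > tab$AIC[1])   # TRUE: every reduced model has a larger AIC
all(tab$p[-1] < 0.05)           # TRUE: every removal is rejected at 0.05
```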
