Using lme4::lmer and lmerTest::lmer

Date: 2016-10-16 10:10:07

Tags: lme4 lmertest

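When I use anova on an lmerTest::lmer object instead of an lme4::lmer object, I get sums of squares and mean squares that are roughly 10 times higher. See the R log below. Note the warning that, when I attach the lmerTest package, the stats::sigma function replaces the lme4::sigma function; I suspect this is what causes the discrepancy. In addition, the anova report now says it is a Type III anova rather than the expected Type I. Is this a bug in the lmerTest package, or does using the Kenward-Roger approximation change the calculation of Sum Sq and Mean Sq and the choice of anova type in a way I do not understand?

I would attach the test file, but it is confidential clinical trial information. If necessary, I can see whether I can put together a test case.

Thanks in advance for any advice you can offer.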

> library(lme4)

Loading required package: Matrix

Attaching package: ‘lme4’

The following object is masked from ‘package:stats’:
sigma

> test100 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep), 
REML = F, data = GSIRlong, subset = !is.na(value))

> library(lmerTest)
Attaching package: ‘lmerTest’
The following object is masked from ‘package:lme4’:
lmer
The following object is masked from ‘package:stats’:
step
Warning message:
replacing previous import ‘lme4::sigma’ by ‘stats::sigma’ when loading 
‘lmerTest’ 

> test200 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep), 
REML = F, data = GSIRlong, subset = !is.na(value))

> anova(test100)
Analysis of Variance Table
              Df  Sum Sq Mean Sq  F value
prepost        1   3.956   3.956  18.4825
lowhi          1 130.647 130.647 610.3836
prepost:lowhi  1   0.038   0.038   0.1758

> anova(test200, ddf = 'Ken')
Analysis of Variance Table of type III  with  Kenward-Roger 
approximation for degrees of freedom

               Sum Sq Mean Sq NumDF  DenDF F.value    Pr(>F)    

prepost         37.15   37.15     1 308.04   18.68 2.094e-05 ***
lowhi         1207.97 1207.97     1 376.43  607.33 < 2.2e-16 ***
prepost:lowhi    0.35    0.35     1 376.43    0.17     0.676   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

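Update: Thank you, Ben. I did a little code archaeology on lmerTest to see whether I could dig up an explanation for the anomalies above. First, it turns out that lmerTest::lmer just submits the model to lme4::lmer and then relabels the result as a "merModLmerTest" object. The only effect of this is that the lmerTest versions of summary() and anova() are invoked instead of the usual defaults from base and stats. (These lmerTest functions are compiled, and I have not yet looked further into the C++ code.) lmerTest::summary just adds three columns to the base::summary result, giving df, t value, and Pr. Note that lmerTest::anova computes a Type III anova by default, rather than Type I as in stats::anova. (That answers my second question above.) This is not a great default if the model includes interactions. One can request a Type I, II, or III anova with the type = 1/2/3 option.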
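As a rough illustration of the relabelling described above, the following sketch fits the same model through both front ends, compares the classes and fixed-effect estimates, and requests the anova type explicitly. It assumes the sleepstudy data from lme4 and an lmerTest version whose anova method accepts type = 1 and type = 3; the exact class name printed depends on the installed lmerTest version, and the object names are illustrative only.

```r
library(lmerTest)  # lmerTest depends on lme4, so lme4 is attached as well

data("sleepstudy", package = "lme4")

# Fit the same model through both front ends, using explicit namespaces
# so that masking of lmer() does not matter.
fit_lme4     <- lme4::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
fit_lmerTest <- lmerTest::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# The lmerTest fit carries an extended class but is still an lme4 model.
class(fit_lme4)
class(fit_lmerTest)
is_mermod <- methods::is(fit_lmerTest, "merMod")

# The fixed-effect estimates should agree between the two fits.
same_fixef <- isTRUE(all.equal(fixef(fit_lme4), fixef(fit_lmerTest)))

# Ask for the anova type explicitly instead of relying on the default.
aov_type1 <- anova(fit_lmerTest, type = 1)
aov_type3 <- anova(fit_lmerTest, type = 3)
aov_type1
aov_type3
```

With a single fixed-effect term, as here, the Type I and Type III tables test the same hypothesis; the choice starts to matter once the model contains several terms or interactions.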

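However, there are other surprises when using the lmerTest versions of summary and anova, as shown in the R console log below. I used the sleepstudy data (shipped with lme4, which lmerTest loads), so this code should be reproducible.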

a. Note that "sleepstudy" has 180 records (with 3 variables).

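b. The summaries of fm1 and fm1a are the same apart from the fixed-effects table, where lmerTest adds columns (the log also shows different t values there: 302.06 and 55.52 versus 36.84 and 6.77). Note, though, that in lmerTest::summary the ddfs for the intercept and Days are 1371 and 1281 respectively, which is odd given that "sleepstudy" has only 180 records.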

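c. As with my original model above, the lme4 and lmerTest versions of anova give very different values of Sum Sq and Mean Sq (30031 and 446.65 respectively).

d. Interestingly, the lmerTest versions of anova using the Satterthwaite and Kenward-Roger estimates of DenDF differ wildly (5794080 and 28 respectively). The K-R value seems more reasonable.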
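One way to sanity-check point c is to look at how the lme4 table is internally consistent. In the lme4 output in the log, Mean Sq divided by the residual variance reproduces the F value (30031 / 654.94 ≈ 45.85), whereas the lmerTest table with the same F value implies 446.65 / 45.853 ≈ 9.74, which matches no variance component printed in the summary. The sketch below checks the lme4 relationship directly; it uses only lme4 and makes no claim about how lmerTest arrives at its numbers.

```r
library(lme4)

data("sleepstudy", package = "lme4")

fm1 <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

tab <- anova(fm1)           # sequential (Type I) table from lme4
resid_var <- sigma(fm1)^2   # residual variance estimate

# In lme4's table the F value is the mean square scaled by the residual variance.
f_from_ms <- tab[["Mean Sq"]] / resid_var
f_matches <- isTRUE(all.equal(f_from_ms, tab[["F value"]]))

tab
resid_var
f_matches
```

If a table from another package reports the same F value but a different Mean Sq, the implied scaling factor (Mean Sq / F) is a quick clue to which variance estimate that package used.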

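Given the issues above, I am reluctant at this point to rely on lmerTest for trustworthy p-values. Based on your (Doug Bates's) r-help post (https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html), I am now using (and recommend) the method posted by Dan Mirman (http://mindingthebrain.blogspot.ch/2014/02/three-ways-to-get-parameter-specific-p.html), shown in the final part of the code below, to compute a naive t-test p-value (assuming essentially infinite degrees of freedom) and a p-value based on the Kenward-Roger estimate of df (using the R package 'pbkrtest' - the same package lmerTest uses). I could not find R code to compute the Satterthwaite estimate. The naive t-test p-value is clearly anti-conservative, but the KR estimate is considered quite good. If the two give similar p estimates, then I think one can be comfortable with a p-value in the range [naive t-test, KR estimate].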

> library(lme4); library(lmerTest); library(pbkrtest)
> dim(sleepstudy)
[1] 180   3
> 
> fm1 <- lme4::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> fm1a <- lmerTest::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> 
> summary(fm1)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy
REML criterion at convergence: 1743.6
Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9536 -0.4634  0.0231  0.4634  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.09   24.740       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)  251.405      6.825   36.84
Days          10.467      1.546    6.77

Correlation of Fixed Effects:
     (Intr)
Days -0.138
> summary(fm1a)
Linear mixed model fit by REML t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy

REML criterion at convergence: 1743.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9536 -0.4634  0.0231  0.4634  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.09   24.740       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)  251.405      6.825 1371.100  302.06   <2e-16 ***
Days          10.467      1.546 1281.700   55.52   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
Days -0.138
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML
    criterion or deviance(.,REML=FALSE) for deviance calculated at the     REML fit
> 
> anova(fm1)
Analysis of Variance Table
     Df Sum Sq Mean Sq F value
Days  1  30031   30031  45.853

> anova(fm1a, ddf = 'Sat', type = 1)
Analysis of Variance Table of type I  with  Satterthwaite 
approximation for degrees of freedom
     Sum Sq Mean Sq NumDF   DenDF F.value    Pr(>F)    
Days 446.65  446.65     1 5794080  45.853 1.275e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML     criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit

> anova(fm1a, ddf = 'Ken', type = 1)
Analysis of Variance Table of type I  with  Kenward-Roger 
approximation for degrees of freedom
     Sum Sq Mean Sq NumDF  DenDF F.value    Pr(>F)    
Days 446.65  446.65     1 27.997  45.853 2.359e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML     criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit
> 
> #  t.test
> coefs <- data.frame(coef(summary(fm1)))
> coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
> coefs
             Estimate Std..Error   t.value          p.z
(Intercept) 251.40510   6.824556 36.838311 0.000000e+00
Days         10.46729   1.545789  6.771485 1.274669e-11
> 
> #  Kenward-Rogers
> df.KR <- get_ddf_Lb(fm1, fixef(fm1))
> df.KR
[1] 25.89366
> coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR))
> coefs
             Estimate Std..Error   t.value          p.z       p.KR
(Intercept) 251.40510   6.824556 36.838311 0.000000e+00 0.0000e+00
Days         10.46729   1.545789  6.771485 1.274669e-11 3.5447e-07
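The comparison at the end of the log can also be cross-checked through pbkrtest's model-comparison interface. The sketch below is one possible variant, not the method from the posts cited above: it obtains the Kenward-Roger denominator df for the Days term from KRmodcomp() by comparing the model with and without Days, then computes the naive (normal) and KR-based p-values from the unadjusted t value, as in the log. The object names are illustrative, and the fits must use REML (the lmer default).

```r
library(lme4)
library(pbkrtest)

data("sleepstudy", package = "lme4")

# Full model and the reduced model without the fixed effect of Days.
fm_large <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
fm_small <- lmer(Reaction ~ 1 + (Days | Subject), data = sleepstudy)

# Kenward-Roger F test for dropping Days.
kr <- KRmodcomp(fm_large, fm_small)
kr

# Denominator degrees of freedom from the Kenward-Roger approximation.
ddf_kr <- getKR(kr, "ddf")

# Unadjusted t value for Days from the lme4 summary.
t_days <- coef(summary(fm_large))["Days", "t value"]

# Naive p-value (normal reference) and p-value using the KR df.
# lower.tail = FALSE avoids the rounding to zero that 1 - pnorm() can give.
p_naive <- 2 * pnorm(abs(t_days), lower.tail = FALSE)
p_kr    <- 2 * pt(abs(t_days), df = ddf_kr, lower.tail = FALSE)

c(naive = p_naive, KR = p_kr)
```

Because the t distribution has heavier tails than the normal, the KR-based p-value is never smaller than the naive one, which gives the bracketing range described above.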

0 answers:

No answers