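When I use anova on an lmerTest::lmer object rather than an lme4::lmer object, the sums of squares and mean squares I get are roughly 10 times higher. See the R log below. Note the warning that, when I attach the lmerTest package, the stats::sigma function overrides the lme4::sigma function; I suspect this is what causes the discrepancy. In addition, the anova report now says it is a Type III anova rather than the expected Type I. Is this a bug in the lmerTest package, or does using the Kenward-Roger approximation change the calculation of Sum Sq and Mean Sq, and the specification of the anova type, in a way I don't understand?
I would attach the test file, but it is confidential clinical trial information. If necessary, I can see whether I can put together a test case.
Thanks in advance for any advice you can offer.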
> library(lme4)
Loading required package: Matrix
Attaching package: ‘lme4’
The following object is masked from ‘package:stats’:
sigma
> test100 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep),
REML = F, data = GSIRlong, subset = !is.na(value))
> library(lmerTest)
Attaching package: ‘lmerTest’
The following object is masked from ‘package:lme4’:
lmer
The following object is masked from ‘package:stats’:
step
Warning message:
replacing previous import ‘lme4::sigma’ by ‘stats::sigma’ when loading
‘lmerTest’
> test200 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep),
REML = F, data = GSIRlong, subset = !is.na(value))
> anova(test100)
Analysis of Variance Table
Df Sum Sq Mean Sq F value
prepost 1 3.956 3.956 18.4825
lowhi 1 130.647 130.647 610.3836
prepost:lowhi 1 0.038 0.038 0.1758
> anova(test200, ddf = 'Ken')
Analysis of Variance Table of type III with Kenward-Roger
approximation for degrees of freedom
Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
prepost 37.15 37.15 1 308.04 18.68 2.094e-05 ***
lowhi 1207.97 1207.97 1 376.43 607.33 < 2.2e-16 ***
prepost:lowhi 0.35 0.35 1 376.43 0.17 0.676
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
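UPDATE: Thanks, Ben. I did a little code archaeology on lmerTest to see whether I could find an explanation for the anomalies above. First, it turns out that lmerTest::lmer simply submits the model to lme4::lmer and then relabels the result as a "merModLmerTest" object. The only effect of this is that the versions of summary() and anova() from the lmerTest package are called instead of the usual defaults from base and stats. (These lmerTest functions are compiled, and I have not yet looked further into the C++ code.) lmerTest::summary just adds three columns to the base::summary results, giving df, t value, and Pr(>|t|). Note that lmerTest::anova by default computes a Type III anova rather than the Type I computed by stats::anova. (This explains my second question above.) That is not a great choice if one's model includes interactions. One can request a Type I, II, or III anova with the type = 1/2/3 option.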
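As a minimal sketch of that last point, the anova type can be set explicitly in the call rather than left at the Type III default. This assumes lme4 and lmerTest are installed and uses the sleepstudy data; the object names `fit` and `aov_type1` are illustrative only, and the accepted values of `type` may differ between lmerTest versions.

```r
# Fit through lmerTest so that anova() dispatches to the lmerTest method.
# Guarded so the snippet does nothing when the packages are not installed.
if (requireNamespace("lme4", quietly = TRUE) &&
    requireNamespace("lmerTest", quietly = TRUE)) {
  data("sleepstudy", package = "lme4")
  fit <- lmerTest::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

  # Request a sequential (Type I) table explicitly instead of the default.
  aov_type1 <- anova(fit, type = 1)
  print(aov_type1)
}
```

With a single fixed-effect term, as here, the choice of type does not change which term is tested; it matters once the model contains interactions, as in the original prepost * lowhi model.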
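However, there are other surprises when using the lmerTest versions of summary and anova, as shown in the R console log below. I used the sleepstudy data (shipped with lme4 and therefore available alongside lmerTest), so this code should be reproducible.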
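a. Note that "sleepstudy" has 180 records (with 3 variables).
b. Apart from the added columns for the fixed effects, the summaries of fm1 and fm1a are identical. But note that in lmerTest::summary the ddfs for the intercept and Days are 1371 and 1281 respectively; odd, given that there are only 180 records in "sleepstudy".
c. As with my original model above, the lme4 and lmerTest versions of anova give very different values for Sum Sq and Mean Sq (30031 and 446.65 respectively).
d. Interestingly, the lmerTest versions of anova using the Satterthwaite and Kenward-Roger estimates of DenDF differ wildly (5794080 and 28 respectively). The K-R value seems more reasonable.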
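One way to look at item c is to back out the denominator each anova table implies, since F value = Mean Sq / denominator. The sketch below uses only numbers copied from the sleepstudy console log in this post; the variable names are illustrative, and reading the gap as a rescaling of Sum Sq is an assumption, not something confirmed from the lmerTest source.

```r
# Values copied from the anova() and summary() output for the sleepstudy fits.
f_value     <- 45.853   # F value, identical in both anova tables
ms_lme4     <- 30031    # Mean Sq from anova(fm1)
ms_lmerTest <- 446.65   # Mean Sq from anova(fm1a, ...)
resid_var   <- 654.94   # Residual variance from summary(fm1)

# Denominator implied by each table: Mean Sq / F value.
denom_lme4     <- ms_lme4 / f_value
denom_lmerTest <- ms_lmerTest / f_value

print(c(lme4 = denom_lme4, lmerTest = denom_lmerTest, residual = resid_var))
```

For the lme4 table the implied denominator comes out close to the reported residual variance, while for the lmerTest table it does not. The F value is the same in both tables even though Sum Sq and Mean Sq are on different scales.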
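Given the issues above, I am at this point reluctant to rely on lmerTest for reliable p-values. Based on your (Doug Bates's) r-help post (https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html), I am now using (and recommending) the method posted by Dan Mirman (http://mindingthebrain.blogspot.ch/2014/02/three-ways-to-get-parameter-specific-p.html), shown in the final bit of the code below, to estimate both the naive t-test p-values (assuming essentially infinite degrees of freedom) and the p-values based on the Kenward-Roger estimate of the df (using the R package 'pbkrtest', the same package lmerTest uses). I could not find R code to compute the Satterthwaite estimates. The naive t-test p-value is clearly anti-conservative, but the KR estimate is considered pretty good. If the two give similar p estimates, then I think one can feel comfortable with a p-value in the range [naive t-test, KR estimate].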
> library(lme4); library(lmerTest); library(pbkrtest); dim(sleepstudy)
[1] 180 3
>
> fm1 <- lme4::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> fm1a <- lmerTest::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
>
> summary(fm1)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (Days | Subject)
Data: sleepstudy
REML criterion at convergence: 1743.6
Scaled residuals:
Min 1Q Median 3Q Max
-3.9536 -0.4634 0.0231 0.4634 5.1793
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 612.09 24.740
Days 35.07 5.922 0.07
Residual 654.94 25.592
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error t value
(Intercept) 251.405 6.825 36.84
Days 10.467 1.546 6.77
Correlation of Fixed Effects:
(Intr)
Days -0.138
> summary(fm1a)
Linear mixed model fit by REML t-tests use Satterthwaite approximations to
degrees of freedom [lmerMod]
Formula: Reaction ~ Days + (Days | Subject)
Data: sleepstudy
REML criterion at convergence: 1743.6
Scaled residuals:
Min 1Q Median 3Q Max
-3.9536 -0.4634 0.0231 0.4634 5.1793
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 612.09 24.740
Days 35.07 5.922 0.07
Residual 654.94 25.592
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 251.405 6.825 1371.100 302.06 <2e-16 ***
Days 10.467 1.546 1281.700 55.52 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
Days -0.138
Warning message:
In deviance.merMod(object, ...) :
deviance() is deprecated for REML fits; use REMLcrit for the REML
criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit
>
> anova(fm1)
Analysis of Variance Table
Df Sum Sq Mean Sq F value
Days 1 30031 30031 45.853
> anova(fm1a, ddf = 'Sat', type = 1)
Analysis of Variance Table of type I with Satterthwaite
approximation for degrees of freedom
Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
Days 446.65 446.65 1 5794080 45.853 1.275e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
deviance() is deprecated for REML fits; use REMLcrit for the REML criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit
> anova(fm1a, ddf = 'Ken', type = 1)
Analysis of Variance Table of type I with Kenward-Roger
approximation for degrees of freedom
Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
Days 446.65 446.65 1 27.997 45.853 2.359e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
deviance() is deprecated for REML fits; use REMLcrit for the REML criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit
>
> # t.test
> coefs <- data.frame(coef(summary(fm1)))
> coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
> coefs
Estimate Std..Error t.value p.z
(Intercept) 251.40510 6.824556 36.838311 0.000000e+00
Days 10.46729 1.545789 6.771485 1.274669e-11
>
> # Kenward-Rogers
> df.KR <- get_ddf_Lb(fm1, fixef(fm1))
> df.KR
[1] 25.89366
> coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR))
> coefs
Estimate Std..Error t.value p.z p.KR
(Intercept) 251.40510 6.824556 36.838311 0.000000e+00 0.0000e+00
Days 10.46729 1.545789 6.771485 1.274669e-11 3.5447e-07
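The two p-value calculations above can be wrapped in a small helper. This is a sketch in base R only: `p_from_t` is a hypothetical name, the t value and K-R df are copied from the output above, and `lower.tail = FALSE` is swapped in for the `1 - pnorm(...)` and `1 - pt(...)` forms to avoid losing precision for very small p-values.

```r
# Two-sided p-value for a t statistic.
# df = Inf gives the naive normal-approximation p-value.
p_from_t <- function(t_value, df = Inf) {
  2 * pt(abs(t_value), df = df, lower.tail = FALSE)
}

t_days <- 6.771485   # t value for Days from coef(summary(fm1))
df_kr  <- 25.89366   # Kenward-Roger df reported by get_ddf_Lb()

p_naive <- p_from_t(t_days)          # essentially infinite df
p_kr    <- p_from_t(t_days, df_kr)   # Kenward-Roger df

print(c(naive = p_naive, KR = p_kr))
```

The two results bracket the range [naive t-test, KR estimate] described above; for Days both ends are far below any conventional significance threshold.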