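I am trying to understand the summary output from a piecewise mixed effects model and could use some insight. Specifically, I want to know how to obtain the regression intercepts and slopes to the left and right of the breakpoint. From my understanding, the intercept given in the output below belongs to the regression line left of the breakpoint, and the value given for I(Days * (Days < 6.07)) is the slope of that line. However, I don't think I(Days * (Days >= 6.07)) is the slope right of the breakpoint, nor is it the difference between the two slopes.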
library(lme4)
sleepstudy<-as.data.frame(sleepstudy)
I took the breakpoint from a previous post: https://stats.stackexchange.com/questions/19772/estimating-the-break-point-in-a-broken-stick-piecewise-linear-model-with-rando
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ I(Days * (Days < 6.07)) + I(Days * (Days >= 6.07)) + (1 | Subject)
Data: sleepstudy
REML criterion at convergence: 1784.369
Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 1377.6   37.12
 Residual              965.7   31.08
Number of obs: 180, groups:  Subject, 18
Fixed effects:
                         Estimate Std. Error t value
(Intercept)              252.2663    10.0545  25.090
I(Days * (Days < 6.07))   10.0754     1.3774   7.315
I(Days * (Days >= 6.07))  10.4513     0.8077  12.940
Correlation of Fixed Effects:
            (Intr) I(*(<6
I(D*(D<6.07 -0.409
I(D*(D>=6.0 -0.374  0.630
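I tried to simplify by removing the random effect. When I() is included in the lm model, the slopes and intercept are very similar to those of the mixed model above, and I am still confused.
mod_lm <- lm(Reaction ~ I(Days * (Days < 6.07)) + I(Days * (Days >= 6.07)), data = sleepstudy)
summary(mod_lm)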
Call:
lm(formula = Reaction ~ I(Days * (Days < 6.07)) + I(Days * (Days >=
6.07)), data = sleepstudy)
Residuals:
     Min       1Q   Median       3Q      Max
-111.581  -27.632    1.614   26.994  141.443
Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               252.266      7.629  33.066  < 2e-16 ***
I(Days * (Days < 6.07))    10.075      2.121   4.751 4.17e-06 ***
I(Days * (Days >= 6.07))   10.451      1.243   8.405 1.37e-14 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 47.84 on 177 degrees of freedom
Multiple R-squared: 0.2867, Adjusted R-squared: 0.2786
F-statistic: 35.57 on 2 and 177 DF, p-value: 1.037e-13
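However, when I() is removed from the lm formula, I understand the output and the results make sense.
mod_lm <- lm(Reaction ~ Days * (Days < 6.07) + Days * (Days >= 6.07), data = sleepstudy)
summary(mod_lm)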
Call:
lm(formula = Reaction ~ Days * (Days < 6.07) + Days * (Days >=
6.07), data = sleepstudy)
Residuals:
     Min       1Q   Median       3Q      Max
-114.214  -27.833    0.603   27.254  141.693
Coefficients: (2 not defined because of singularities)
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)            207.008     64.211   3.224  0.00151 **
Days                    16.050      7.985   2.010  0.04595 *
Days < 6.07TRUE         45.908     64.671   0.710  0.47872
Days >= 6.07TRUE            NA         NA      NA       NA
Days:Days < 6.07TRUE    -6.125      8.265  -0.741  0.45965
Days:Days >= 6.07TRUE       NA         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 47.91 on 176 degrees of freedom
Multiple R-squared: 0.2887, Adjusted R-squared: 0.2766
F-statistic: 23.81 on 3 and 176 DF, p-value: 5.526e-13
When the I() terms are removed from the lmer formula, lmer will not run.
mod1<-lmer(Reaction ~ Days*(Days < 6.07) + Days*(Days>= 6.07) + (1|Subject), data = sleepstudy)
Error in lme4::lFormula(formula = Reaction ~ Days * (Days < 6.07) + Days * :
rank of X = 4 < ncol(X) = 6
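Can someone tell me how to interpret the lmer() output when I() is used on the model predictors, or else how to run the lmer() model without I() on the predictors?
I appreciate any guidance, as I was unable to find any on the R help pages!
Thanks.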
Answer 0 (score: 5)
I think you can get what you want as follows:
library(lme4)
sleepstudy <- transform(sleepstudy,period=(Days<6.5))
(m0 <- lmer(Reaction ~ Days+ (1 | Subject), sleepstudy))
(m2 <- lmer(Reaction ~ Days*period+ (1 | Subject), sleepstudy))
##
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days * period + (1 | Subject)
## Data: sleepstudy
## REML criterion at convergence: 1773.86
## Random effects:
##  Groups   Name        Std.Dev.
##  Subject  (Intercept) 37.12
##  Residual             31.06
## Number of obs: 180, groups: Subject, 18
## Fixed Effects:
##     (Intercept)             Days       periodTRUE  Days:periodTRUE
##         207.008           16.050           45.908           -6.125
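The fixed effects printed above can be combined into one intercept and one slope per segment. The following is a minimal sketch, assuming `period` is defined as `Days < 6.5` exactly as in the code above; the names `b`, `left` and `right` are introduced here for illustration only.

```r
library(lme4)

# Same setup as above: period is TRUE before the breakpoint (Days < 6.5)
sleepstudy <- transform(sleepstudy, period = (Days < 6.5))
m2 <- lmer(Reaction ~ Days * period + (1 | Subject), sleepstudy)

b <- fixef(m2)

# Right of the breakpoint (period == FALSE): the baseline coefficients
right <- c(intercept = unname(b["(Intercept)"]),
           slope     = unname(b["Days"]))

# Left of the breakpoint (period == TRUE): baseline plus the period adjustments
left <- c(intercept = unname(b["(Intercept)"] + b["periodTRUE"]),
          slope     = unname(b["Days"] + b["Days:periodTRUE"]))

round(rbind(left, right), 3)
```

With the estimates shown above, the left segment should come out at roughly 207.0 + 45.9 for the intercept and 16.05 - 6.13 for the slope, while the right segment uses the `(Intercept)` and `Days` coefficients directly.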
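The effect of I() is to construct numeric variables rather than categorical variables (which get converted to dummy variables). Perhaps the main source of your confusion is that your first set of models doesn't allow for separate intercepts by period, only separate slopes ...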
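Writing out the fixed-effect part of the two model families may make the difference easier to see. The notation below (breakpoint $c$, indicator $\mathbf{1}(\cdot)$, coefficients $\beta$ and $\gamma$) is introduced here for illustration and does not appear in the original post; the subject-level random intercept is omitted.

```latex
\begin{align*}
% Models using I(): one shared intercept, a separate slope on each side
\mathrm{E}[\text{Reaction}] &= \beta_0
  + \beta_1\,\text{Days}\cdot\mathbf{1}(\text{Days} < c)
  + \beta_2\,\text{Days}\cdot\mathbf{1}(\text{Days} \ge c) \\
% Models using Days * period, with period = 1(Days < c)
\mathrm{E}[\text{Reaction}] &= \gamma_0 + \gamma_1\,\text{Days}
  + \gamma_2\,\text{period} + \gamma_3\,\text{Days}\cdot\text{period}
\end{align*}
```

In the first form both segments share the intercept $\beta_0$ (the value at Days = 0), with slope $\beta_1$ on the left and $\beta_2$ on the right. In the second form the right segment has intercept $\gamma_0$ and slope $\gamma_1$, and the left segment has intercept $\gamma_0 + \gamma_2$ and slope $\gamma_1 + \gamma_3$.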
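The reason lmer doesn't work for the second set of models is that lmer is not as tolerant of overparameterization (multicollinear predictors) as lm is, although the development version (available on Github, soon to be released) is: if you run mod1 it will fit the model and print the message "fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients" (unlike lm, it doesn't keep the dropped columns with NA coefficients, it just drops them entirely).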
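The rank deficiency can be checked directly on the fixed-effects design matrix. This is a small sketch using base R's `model.matrix()` and `qr()` on the formula from the question; the variable name `X` is chosen here for illustration.

```r
library(lme4)  # provides the sleepstudy data

# Fixed-effects design matrix for the formula without I()
X <- model.matrix(~ Days * (Days < 6.07) + Days * (Days >= 6.07),
                  data = sleepstudy)

ncol(X)      # number of columns, i.e. coefficients requested
qr(X)$rank   # numerical rank of the design matrix
colnames(X)  # the two indicator columns sum to the intercept column
```

The two counts correspond to the values reported in the error message above, `rank of X = 4 < ncol(X) = 6`: the `Days >= 6.07` columns carry no information beyond what the intercept, `Days`, and the `Days < 6.07` columns already provide.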
Update:
The difference in slopes is a little hard to see; it is quite subtle.
sleepstudy <- transform(sleepstudy,cDays=Days-6.5)
m3 <- lmer(Reaction ~ cDays:period+ (1 | Subject), sleepstudy)
library(ggplot2); theme_set(theme_bw())
library(reshape2)
g0 <- ggplot(sleepstudy,aes(Days,Reaction,group=Subject))+geom_line()
pframe <- data.frame(Days=seq(0,8,length.out=101))
## period must be defined as in the fitted data (TRUE when Days < 6.5)
pframe <- transform(pframe,cDays=Days-6.5,period=Days<6.5)
## next line assumes latest version of lme4 -- you may need REform instead
pframe$Reaction <- predict(m3,newdata=pframe,re.form=NA)
pframe$Reaction2 <- predict(m0,newdata=pframe,re.form=NA)
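For reading the slopes off the centered model numerically rather than from a plot, a minimal sketch follows. It assumes the same definitions of `period` and `cDays` as above and simply prints the fixed effects of `m3`.

```r
library(lme4)

# Same variables as above: period is TRUE before the breakpoint,
# cDays is Days centered at the breakpoint
sleepstudy <- transform(sleepstudy,
                        period = (Days < 6.5),
                        cDays  = Days - 6.5)
m3 <- lmer(Reaction ~ cDays:period + (1 | Subject), sleepstudy)

# One intercept plus one cDays slope per level of period
fixef(m3)
```

Because `cDays` is zero at the breakpoint, the intercept of this model is the fitted value at Days = 6.5, shared by both segments, so the two lines meet there; the remaining two coefficients are the slopes on either side.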