Question

我以前曾在这里寻求过帮助，我希望能再次获得一些帮助。

我估计了一个相当大的混合效应模型，其中一个随机效应有超过150个不同的水平。这将使标准的毛毛虫情节变得非常难以理解。

我希望，如果可能的话，得到一个只有随机效应水平的毛虫情节，因为缺乏更好的术语，＆＃34;重要的＆＃34;。那就是：我想要一个毛虫图，其中随机截距或变化系数的随机斜率具有＆＃34;置信区间＆＃34; （我知道它并不完全是什么）不包括零。

从sleepstudy标准的lme4数据中考虑此标准模型。

library(lme4)
fit <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
ggCaterpillar(ranef(fit,condVar=TRUE), QQ=FALSE, likeDotplot=TRUE, reorder=FALSE)[["Subject"]]

我会得到这个毛毛虫情节。

a caterpillar plot

我使用的毛虫图来自this code。注意我倾向于使用较不保守的界限（即1.645 * se而不是1.96 * se）。

基本上，我想要一个只包含308,309,310,330,331,335,337,349,350,352和370等级的毛虫图，因为这些等级截取或< / strong>斜率，其间隔不包括零。我问，因为我的150多个不同级别的毛虫情节是不可读的，我认为这可能是一个值得解决的问题。

以下是可重现的代码。我真的很感激任何帮助。

# https://stackoverflow.com/questions/34120578/how-can-i-sort-random-effects-by-value-of-the-random-effect-not-the-intercept ggCaterpillar <- function(re, QQ=TRUE, likeDotplot=TRUE, reorder=TRUE) { require(ggplot2) f <- function(x) { pv <- attr(x, "postVar") cols <- 1:(dim(pv)[1]) se <- unlist(lapply(cols, function(i) sqrt(pv[i, i, ]))) if (reorder) { ord <- unlist(lapply(x, order)) + rep((0:(ncol(x) - 1)) * nrow(x), each=nrow(x)) pDf <- data.frame(y=unlist(x)[ord], ci=1.645*se[ord], nQQ=rep(qnorm(ppoints(nrow(x))), ncol(x)), ID=factor(rep(rownames(x), ncol(x))[ord], levels=rownames(x)[ord]), ind=gl(ncol(x), nrow(x), labels=names(x))) } else { pDf <- data.frame(y=unlist(x), ci=1.645*se, nQQ=rep(qnorm(ppoints(nrow(x))), ncol(x)), ID=factor(rep(rownames(x), ncol(x)), levels=rownames(x)), ind=gl(ncol(x), nrow(x), labels=names(x))) } if(QQ) { ## normal QQ-plot p <- ggplot(pDf, aes(nQQ, y)) p <- p + facet_wrap(~ ind, scales="free") p <- p + xlab("Standard normal quantiles") + ylab("Random effect quantiles") } else { ## caterpillar dotplot p <- ggplot(pDf, aes(ID, y)) + coord_flip() if(likeDotplot) { ## imitate dotplot() -> same scales for random effects p <- p + facet_wrap(~ ind) } else { ## different scales for random effects p <- p + facet_grid(ind ~ ., scales="free_y") } p <- p + xlab("Levels of the Random Effect") + ylab("Random Effect") } p <- p + theme(legend.position="none") p <- p + geom_hline(yintercept=0) p <- p + geom_errorbar(aes(ymin=y-ci, ymax=y+ci), width=0, colour="black") p <- p + geom_point(aes(size=1.2), colour="blue") return(p) } lapply(re, f) } library(lme4) fit <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy) ggCaterpillar(ranef(fit,condVar=TRUE), QQ=FALSE, likeDotplot=TRUE, reorder=FALSE)[["Subject"]] ggsave(file="sleepstudy.png")

Answer 1

首先，感谢您在引号中加上“重要”......每个阅读此内容的人都应该记住，在这种情况下，重要性没有任何统计含义（使用Z可能更好-statistic（value / std.error）标准，例如| Z |＆gt; 1.5或| Z |＆gt; 1.75而不是强调这是不推理阈值...）

我最终得到了一点点......我决定稍微重构/模块化一些东西会更好，所以我写了一个augment方法（旨在与{{1}一起使用从broom个对象构造有用的数据框......一旦完成，你想要的操作就很容易了。

我将ranef.mer代码放在我的答案的末尾 - 它有点长（你需要在此处运行代码之前获取它）。

augment.ranef.mer

将library(broom) library(reshape2) library(plyr)方法应用于RE对象：

augment

现在rr <- ranef(fit,condVar=TRUE) aa <- augment(rr) names(aa) ## [1] "grp" "variable" "level" "estimate" "qq" "std.error" ## [7] "p" "lb" "ub"代码非常基本。我使用ggplot而不是geom_errorbarh(height=0)，因为geom_pointrange()+coord_flip()无法将ggplot2与coord_flip一起使用...

facet_wrap(...,scales="free")

现在找到你想要保留的等级：

## Q-Q plot:
g0 <- ggplot(aa,aes(estimate,qq,xmin=lb,xmax=ub))+
    geom_errorbarh(height=0)+
    geom_point()+facet_wrap(~variable,scale="free_x")

## regular caterpillar plot:
g1 <- ggplot(aa,aes(estimate,level,xmin=lb,xmax=ub))+
    geom_errorbarh(height=0)+
    geom_vline(xintercept=0,lty=2)+
    geom_point()+facet_wrap(~variable,scale="free_x")

更新仅具有“显着”斜率或截距的水平的毛虫图：

aa2 <- ddply(aa,c("grp","level"),
             transform,
             keep=any(p<0.05))
aa3 <- subset(aa2,keep)

如果您只想强调“重要”级别而不是完全删除“非重要”级别

g1 %+% aa3

ggplot(aa2,aes(estimate,level,xmin=lb,xmax=ub,colour=factor(keep)))+
    geom_errorbarh(height=0)+
    geom_vline(xintercept=0,lty=2)+
    geom_point()+facet_wrap(~variable,scale="free_x")+
    scale_colour_manual(values=c("black","red"),guide=FALSE)

只有＆＃34;重要＆＃34;的毛虫情节。混合效应模型的随机效应

1 个答案: