Problem plotting predicted probabilities and prediction bands from a glmer model

Time: 2020-07-24 01:21:30

Tags: r logistic-regression lme4 mixed-models

I have a model that I fit with glmer, as follows:

library(lme4)  # provides glmer() and glmerControl()

multi.sanctions.bust.full.ag <- glmer(allbuster ~ lageutradeshare100 + lagtradeopenP  + colonial
                                    + lagtradesharePT + lnlaggdpp + lnlaggdpt  + duration + lndist + nobust + nobustsq + nobustcb + (1 | partnercode) + (1 | caseid),
                                      data=sanctions.data.new.scaled, family=binomial(link="logit"),
                              nAGQ=1,control=glmerControl(optimizer="nlminbwrap",optCtrl=list(maxfun=2e5)))

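The data can be accessed here.

I have worked out how to get predicted probabilities with R's predict function using the code below. As far as I can tell, the predicted probabilities are correct: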

tmpdat_intraeu <- multi.sanctions.bust.full.ag@frame[, c("caseid", "lndist", "lnlaggdpt", "duration",
"partnercode", "lnlaggdpp", "lagtradeopenP", 
"lageutradeshare100", "lagtradesharePT", "nobust", "nobustsq", 
"nobustcb", "colonial")]

jvalues_intraeu <- with(multi.sanctions.bust.full.ag, seq(from = 
min(multi.sanctions.bust.full.ag@frame[["lageutradeshare100"]]), 
to = max(multi.sanctions.bust.full.ag@frame[["lageutradeshare100"]]), 
length.out = 100))

pp_intraeu <- lapply(jvalues_intraeu, function(j) {
  tmpdat_intraeu$lageutradeshare100 <- j
  predict(multi.sanctions.bust.full.ag, newdata = tmpdat_intraeu, type = "response", re.form = NA)
})

# I don't think that the lines below this point are working for me; this is where I think the problem is:

plotdat_intraeu <- t(sapply(pp_intraeu, function(x) {
    c(M = mean(x), Med = median(x), quantile(x, c(0.25, 0.75), na.rm = TRUE), (mean(x)-(2*sd(x))),
      (mean(x)+(2*sd(x))))
}))

plotdat_intraeu <- as.data.frame(cbind(plotdat_intraeu, jvalues_intraeu))
colnames(plotdat_intraeu) <- c("PredictedProbabilityMean", "PredProbMedian", "quartile1", "quartile3", "lowersd", "uppersd", "lageutradeshare100")
head(plotdat_intraeu)
tail(plotdat_intraeu)

library(ggplot2)  # needed for ggplot(), geom_line(), geom_ribbon(), geom_rug()

sb_intraeu <- ggplot() + geom_line(data=plotdat_intraeu, aes(x = lageutradeshare100, y = PredictedProbabilityMean), size = 2, color="blue") + 
  geom_ribbon(data=plotdat_intraeu, aes(x = lageutradeshare100, ymin = lowersd, ymax = uppersd),
              fill = "grey50", alpha=.5) +
  ylim(c(-.5, 1)) + 
  geom_hline(yintercept=0) +
    geom_rug(data=subset(multi.sanctions.bust.full.ag@frame,allbuster==0), aes(x=lageutradeshare100), color="black", size=1.0, sides="b", alpha= 3/4, length = unit(0.05, "npc")) +
    geom_rug(data=subset(multi.sanctions.bust.full.ag@frame,allbuster==1), aes(x=lageutradeshare100), color="red", size=1.0, sides="b", alpha = 1) +  
 theme(panel.grid.major = element_line(colour = "gray", linetype = "dotted"), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.title.y = element_text(size=12, face="bold"), axis.title.x = element_text(size=12, face="bold")) +
  xlab("Intra-EU Trade Share") + 
  ylab("Predicted Probability of Sanctions Busting") 

sb_intraeu

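My problem is that the resulting plot looks like this: [plot image not shown]

When I sent my dissertation to my committee for review, a faculty member told me that the confidence intervals are not calculated correctly and are far too wide. I agree with that assessment, and I have seen that confidence intervals are difficult to obtain for these kinds of models, but I am at a loss as to how to "fix" the plot. I have seen suggestions to use predictInterval, or bootMer for bootstrapping, but I have not been able to work out how to get them to work.

Any help would be greatly appreciated. My dissertation is essentially written, but I cannot submit it until I have a better visualization of the effect of my key independent variable (IV).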

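1 Answer:

Answer 0 (score: 0)

I have recently been working with the merTools package, which may help. A tutorial on how to use the package can be found here.

One nice thing about this package is that it lets you specify which sources of random variation to account for in the intervals it reports. I would write more about this here, but the tutorial explains it better than I can, so I recommend reading its "Uncertainty" section.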
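To give a rough idea of what this looks like in practice, here is a minimal sketch on simulated data. It is not the model from the question: the toy data frame, the variable names y, x and g, and the settings (level, n.sims, seed) are all assumptions made up for illustration. It calls merTools::predictInterval twice over a grid of x values, once with which = "fixed" (only fixed-effect uncertainty) and once with which = "full" (fixed and random effects), and returns the intervals on the probability scale.

```r
library(lme4)      # glmer()
library(merTools)  # predictInterval()

# Simulated toy data (hypothetical; stands in for the real sanctions data)
set.seed(1)
n_groups <- 30
n_per    <- 40
g   <- factor(rep(seq_len(n_groups), each = n_per))
x   <- rnorm(n_groups * n_per)
u   <- rnorm(n_groups, sd = 0.8)                 # group-level random intercepts
eta <- -0.5 + 0.9 * x + u[as.integer(g)]         # linear predictor
y   <- rbinom(length(eta), size = 1, prob = plogis(eta))
toy <- data.frame(y = y, x = x, g = g)

# Random-intercept logistic model, analogous in form to the one in the question
fit <- glmer(y ~ x + (1 | g), data = toy, family = binomial(link = "logit"))

# Grid over the predictor of interest; the grouping column is kept in newdata
grid <- data.frame(
  x = seq(min(toy$x), max(toy$x), length.out = 50),
  g = toy$g[1]
)

# Intervals that reflect fixed-effect uncertainty only
pi_fixed <- predictInterval(
  fit, newdata = grid, which = "fixed", level = 0.95, n.sims = 1000,
  type = "probability", include.resid.var = FALSE, seed = 42
)

# Intervals that also include the random-effect uncertainty for group g
pi_full <- predictInterval(
  fit, newdata = grid, which = "full", level = 0.95, n.sims = 1000,
  type = "probability", include.resid.var = FALSE, seed = 42
)

# Columns fit, lwr and upr can be bound to the grid for plotting
plot_fixed <- cbind(grid, pi_fixed)
head(plot_fixed)
```

The lwr and upr columns could then be passed to geom_ribbon as ymin and ymax in place of the mean plus or minus two standard deviations used in the question. Because the bounds are simulated on the link scale and then transformed, they stay between 0 and 1. Whether "fixed" or "full" is the right choice depends on which kind of uncertainty the plot is meant to show, so treat the settings above as a starting point rather than a recommendation.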