In this post I am looking for help figuring out how to check that the negative binomial GLMM I am running meets its model assumptions.
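I am working with a dataset I collected in the field that records the presence of recreational trails along 28 streams. The streams are surrounded by four different land-use types and are found within and around 7 different protected areas (i.e. national parks). My original intention was to use a linear mixed model to assess how the "recreational trails score" recorded at each site varies with land-use type, while accounting for the random effect of Park. Because the data contain multiple (true) 0s, I could not transform them to obtain a normal distribution, so I tried a quasi-Poisson GLMM instead. That model was overdispersed, so I decided to try a negative binomial model, and now I am looking for help figuring out whether this model meets its assumptions. (I suspect there are problems with the model, because the results do not seem consistent with what I would expect given the distribution of the data.) Although I was able to find a post on validating GLMMs, I am still unclear about what exactly I need to check, and how I should check it, to make sure the model is OK.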
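Below is R code that lets you reproduce the model I created. I would appreciate feedback on: a) whether this model meets its assumptions, and how exactly you were able to confirm this; b) if it does not, what you would suggest doing next.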
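As a rough illustration of why a plain Poisson assumption looked doubtful for these counts, here is a minimal base-R sketch that computes the share of zeros and the raw variance-to-mean ratio of the trail scores. The name `trails` is only a placeholder for this sketch, and the ratio ignores land use and Park entirely, so it is a crude first look rather than a formal test.

```r
# Trail scores, copied from the reproducible example in this post
trails <- c(5, 0, 0, 4, 7, 0, 0, 0, 6, 5, 0, 6, 6, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 4, 6, 8, 0, 0, 7)

# Share of sites with a score of zero
zero_share <- mean(trails == 0)

# Under a Poisson distribution the variance equals the mean,
# so a ratio well above 1 hints at overdispersion (ignoring covariates)
var_mean_ratio <- var(trails) / mean(trails)

print(c(zero_share = zero_share, var_mean_ratio = var_mean_ratio))
```

A ratio well above 1 here does not by itself show that the fitted model is overdispersed, since part of the spread may be explained by land use and Park.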
#Create dataframe
RecreationalTrails<-c(5, 0, 0, 4, 7, 0, 0, 0, 6, 5, 0, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0,4, 6, 8, 0, 0, 7)
LandUse<-c("Protected", "Agricultural", "Forestry", "Unprotected Forest", "Protected", "Agricultural", "Forestry", "Unprotected Forest",
"Protected", "Agricultural", "Forestry", "Unprotected Forest","Protected", "Agricultural", "Forestry", "Unprotected Forest",
"Protected", "Agricultural", "Forestry", "Unprotected Forest","Protected", "Agricultural", "Forestry", "Unprotected Forest",
"Protected", "Agricultural", "Forestry", "Unprotected Forest")
Parc<-c("Monts Valin", "Monts Valin", "Monts Valin", "Monts Valin", "Fjords du Saguenay", "Fjords du Saguenay","Fjords du Saguenay",
"Fjords du Saguenay", "Hautes Gorges", "Hautes Gorges","Hautes Gorges","Hautes Gorges", "Grands Jardins", "Grands Jardins",
"Grands Jardins", "Grands Jardins", "Mont Tremblant", "Mont Tremblant", "Mont Tremblant", "Mont Tremblant", "Mauricie",
"Mauricie", "Mauricie", "Mauricie", "Jacques Cartier", "Jacques Cartier", "Jacques Cartier", "Jacques Cartier")
# Combine the vectors into one data frame; LandUse and Parc are stored as
# factors because the random-effect grouping variable and mcp() need factors
ESCombinedDataRE <- data.frame(LandUse, Parc, RecreationalTrails,
                               stringsAsFactors = TRUE)
ESCombinedDataRE
#Visualize the data
library(ggplot2)
RecTrails_boxplot <- ggplot(ESCombinedDataRE,
                            aes(x = LandUse, y = RecreationalTrails, fill = LandUse)) +
  scale_fill_manual(values = c("#0072B2", "#56B4E9", "#009E73", "#F0E442")) +
  geom_boxplot() +
  theme_gray(base_size = 14) +
  theme(legend.position = "none",
        axis.text.x = element_text(size = 18, angle = 25, hjust = 1, vjust = 1),
        axis.title.y = element_text(size = 13),
        axis.text.y = element_text(size = 13),
        axis.title.x = element_text(size = 18)) +
  labs(x = "", y = "Recreational Trails\nScore")
RecTrails_boxplot
#Run glmer negative binomial model
install.packages("glmmADMB",
repos=c("http://glmmadmb.r-forge.r-project.org/repos",
getOption("repos")),
type="source")
library(glmmADMB)
R_glmer <- glmmadmb(RecreationalTrails ~ LandUse+ (1|Parc),
data=ESCombinedDataRE, family= "nbinom")
#Validate model
#Check overdispersion using overdisp_fun(), defined in this source code
#(its definition has to be pasted into the session first):
# https://rdrr.io/github/markushuff/PsychHelperFunctions/src/R/overdisp_fun.R
overdisp_fun(R_glmer)
#How else should I be checking that this model is meeting its assumptions?
#Model output and results
summary(R_glmer)
library(multcomp)  # provides glht() and mcp()
summary(glht(R_glmer, linfct = mcp(LandUse = "Tukey")))
#Use this code to look at conditional and marginal R2 values
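For readers who cannot open the linked overdispersion function, here is a minimal sketch of the kind of Pearson-based check I understand it to perform. The function name `pearson_overdispersion` is made up for this sketch and may not match the linked code exactly. To keep the example runnable in base R, it is applied to a plain Poisson `glm()` without the Park random effect, which is a deliberate simplification and not the model fitted above.

```r
# Same data as in the post, rebuilt here so the sketch is self-contained
trails <- c(5, 0, 0, 4, 7, 0, 0, 0, 6, 5, 0, 6, 6, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 4, 6, 8, 0, 0, 7)
land_use <- factor(rep(c("Protected", "Agricultural", "Forestry",
                         "Unprotected Forest"), times = 7))

# Simplifying assumption: fixed effects only, Poisson errors, no random effect
fit <- glm(trails ~ land_use, family = poisson)

# Hypothetical helper: sum of squared Pearson residuals compared with the
# residual degrees of freedom; a ratio well above 1 suggests overdispersion
pearson_overdispersion <- function(model) {
  rdf <- df.residual(model)
  rp <- residuals(model, type = "pearson")
  chisq <- sum(rp^2)
  c(chisq = chisq,
    ratio = chisq / rdf,
    rdf = rdf,
    p = pchisq(chisq, df = rdf, lower.tail = FALSE))
}

result <- pearson_overdispersion(fit)
print(result)
```

For a mixed model the residual degrees of freedom are only approximate, so the ratio is probably better read as a rough guide than as an exact test.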