我希望有人可以帮助我。我正在尝试进行一项分析,该分析检查了在高程梯度下捕获的膜翅目样本的数量。我想研究与海拔有关的单模态分布的可能性,以及线性分布。因此,我将I(Altitude^2)
作为解释变量包括在分析中。
我正在尝试运行以下模型,其中包括Poisson错误结构(我们处理计数数据)和日期和陷阱类型(Trap
)作为随机效应。
model7 <- glmer(No.Specimens~Altitude+I(Altitude^2)+(1|Date)+(1|Trap),
family="poisson",data=Santa.Lucia,na.action=na.omit)
但是我一直收到以下错误消息:
Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate
In addition: Warning messages:
1: Some predictor variables are on very different scales: consider rescaling
2: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) :
Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431
3: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) :
Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431
显然,我犯了一些重大错误。任何人都可以帮我弄清楚我哪里出错了吗?
以下是数据帧的结构:
str(Santa.Lucia)
'data.frame': 97 obs. of 6 variables:
$ Date : Factor w/ 8 levels "01-Sep-2014",..: 6 6 6 6 6 6 6 6 6 6 ...
$ Trap.No : Factor w/ 85 levels "N1","N10","N11",..: 23 48 51 14 17 20 24 27 30 33 ...
$ Altitude : int 1558 1635 1703 1771 1840 1929 1990 2047 2112 2193 ...
$ Trail : Factor w/ 3 levels "Cascadas","Limones",..: 1 1 1 1 1 3 3 3 3 3 ...
$ No.Specimens: int 1 0 2 2 3 4 5 0 1 1 ...
$ Trap : Factor w/ 2 levels "Net","Pan": 2 2 2 2 2 2 2 2 2 2 ...
这是完整的data.set(这些只是我的初步分析)
Date Trap.No Altitude Trail No.Specimens Trap
1 28-Aug-2014 W2 1558 Cascadas 1 Pan
2 28-Aug-2014 W5 1635 Cascadas 0 Pan
3 28-Aug-2014 W8 1703 Cascadas 2 Pan
4 28-Aug-2014 W11 1771 Cascadas 2 Pan
5 28-Aug-2014 W14 1840 Cascadas 3 Pan
6 28-Aug-2014 W17 1929 Tower 4 Pan
7 28-Aug-2014 W20 1990 Tower 5 Pan
8 28-Aug-2014 W23 2047 Tower 0 Pan
9 28-Aug-2014 W26 2112 Tower 1 Pan
10 28-Aug-2014 W29 2193 Tower 1 Pan
11 28-Aug-2014 W32 2255 Tower 0 Pan
12 30-Aug-2014 N1 1562 Cascadas 5 Net
13 30-Aug-2014 N2 1635 Cascadas 0 Net
14 30-Aug-2014 N3 1723 Cascadas 2 Net
15 30-Aug-2014 N4 1779 Cascadas 0 Net
16 30-Aug-2014 N5 1842 Cascadas 3 Net
17 30-Aug-2014 N6 1924 Tower 2 Net
18 30-Aug-2014 N7 1979 Tower 2 Net
19 30-Aug-2014 N8 2046 Tower 0 Net
20 30-Aug-2014 N9 2110 Tower 0 Net
21 30-Aug-2014 N10 2185 Tower 0 Net
22 30-Aug-2014 N11 2241 Tower 0 Net
23 31-Aug-2014 N1 1562 Cascadas 1 Net
24 31-Aug-2014 N2 1635 Cascadas 1 Net
25 31-Aug-2014 N3 1723 Cascadas 0 Net
26 31-Aug-2014 N4 1779 Cascadas 0 Net
27 31-Aug-2014 N5 1842 Cascadas 0 Net
28 31-Aug-2014 N6 1924 Tower 0 Net
29 31-Aug-2014 N7 1979 Tower 7 Net
30 31-Aug-2014 N8 2046 Tower 4 Net
31 31-Aug-2014 N9 2110 Tower 6 Net
32 31-Aug-2014 N10 2185 Tower 1 Net
33 31-Aug-2014 N11 2241 Tower 1 Net
34 01-Sep-2014 W1 1539 Cascadas 0 Pan
35 01-Sep-2014 W2 1558 Cascadas 0 Pan
36 01-Sep-2014 W3 1585 Cascadas 2 Pan
37 01-Sep-2014 W4 1604 Cascadas 0 Pan
38 01-Sep-2014 W5 1623 Cascadas 1 Pan
39 01-Sep-2014 W6 1666 Cascadas 4 Pan
40 01-Sep-2014 W7 1699 Cascadas 0 Pan
41 01-Sep-2014 W8 1703 Cascadas 0 Pan
42 01-Sep-2014 W9 1746 Cascadas 1 Pan
43 01-Sep-2014 W10 1762 Cascadas 0 Pan
44 01-Sep-2014 W11 1771 Cascadas 0 Pan
45 01-Sep-2014 W12 1796 Cascadas 1 Pan
46 01-Sep-2014 W13 1825 Cascadas 0 Pan
47 01-Sep-2014 W14 1840 Tower 4 Pan
48 01-Sep-2014 W15 1859 Tower 2 Pan
49 01-Sep-2014 W16 1889 Tower 2 Pan
50 01-Sep-2014 W17 1929 Tower 0 Pan
51 01-Sep-2014 W18 1956 Tower 0 Pan
52 01-Sep-2014 W19 1990 Tower 1 Pan
53 01-Sep-2014 W20 2002 Tower 3 Pan
54 01-Sep-2014 W21 2023 Tower 2 Pan
55 01-Sep-2014 W22 2047 Tower 0 Pan
56 01-Sep-2014 W23 2068 Tower 1 Pan
57 01-Sep-2014 W24 2084 Tower 0 Pan
58 01-Sep-2014 W25 2112 Tower 1 Pan
59 01-Sep-2014 W26 2136 Tower 0 Pan
60 01-Sep-2014 W27 2150 Tower 1 Pan
61 01-Sep-2014 W28 2193 Tower 1 Pan
62 01-Sep-2014 W29 2219 Tower 0 Pan
63 01-Sep-2014 W30 2227 Tower 1 Pan
64 01-Sep-2014 W31 2255 Tower 0 Pan
85 03/06/2015 WT47 1901 Tower 2 Pan
86 03/06/2015 WT48 1938 Tower 2 Pan
87 03/06/2015 WT49 1963 Tower 2 Pan
88 03/06/2015 WT50 1986 Tower 0 Pan
89 03/06/2015 WT51 2012 Tower 9 Pan
90 03/06/2015 WT52 2033 Tower 0 Pan
91 03/06/2015 WT53 2050 Tower 4 Pan
92 03/06/2015 WT54 2081 Tower 2 Pan
93 03/06/2015 WT55 2107 Tower 1 Pan
94 03/06/2015 WT56 2128 Tower 4 Pan
95 03/06/2015 WT57 2155 Tower 0 Pan
96 03/06/2015 WT58 2179 Tower 2 Pan
97 03/06/2015 WT59 2214 Tower 0 Pan
98 03/06/2015 WT60 2233 Tower 0 Pan
99 03/06/2015 WT61 2261 Tower 0 Pan
100 03/06/2015 WT62 2278 Tower 0 Pan
101 03/06/2015 WT63 2300 Tower 0 Pan
102 04/06/2015 WT31 1497 Cascadas 0 Pan
103 04/06/2015 WT32 1544 Cascadas 1 Pan
104 04/06/2015 WT33 1568 Cascadas 1 Pan
105 04/06/2015 WT34 1574 Cascadas 0 Pan
106 04/06/2015 WT35 1608 Cascadas 5 Pan
107 04/06/2015 WT36 1630 Cascadas 3 Pan
108 04/06/2015 WT37 1642 Cascadas 0 Pan
109 04/06/2015 WT38 1672 Cascadas 5 Pan
110 04/06/2015 WT39 1685 Cascadas 6 Pan
111 04/06/2015 WT40 1723 Cascadas 3 Pan
112 04/06/2015 WT41 1744 Cascadas 2 Pan
113 04/06/2015 WT42 1781 Cascadas 1 Pan
114 04/06/2015 WT43 1794 Cascadas 2 Pan
115 04/06/2015 WT44 1833 Cascadas 0 Pan
116 04/06/2015 WT45 1855 Cascadas 4 Pan
117 04/06/2015 WT46 1876 Cascadas 2 Pan
答案 0 :(得分:2)
问题几乎肯定是由于您将字符向量传递给data
参数:
..., data="Santa.Lucia, ..."
?glmer
表示data
参数应为:
data: an optional data frame containing the variables named in
‘formula’. By default the variables are taken from the
environment from which ‘lmer’ is called. While ‘data’ is
optional, the package authors _strongly_ recommend its use,
especially when later applying methods such as ‘update’ and
‘drop1’ to the fitted model (_such methods are not guaranteed
to work properly if ‘data’ is omitted_). If ‘data’ is
omitted, variables will be taken from the environment of
‘formula’ (if specified as a formula) or from the parent
frame (if specified as a character vector).
括号中的最后一部分,“如果指定为字符向量”涉及如果formula
的规范是字符向量而不指定{{1}作为一个角色。
更正您的电话以包含data
,您应该好好去。
答案 1 :(得分:1)
您已设法为Date使用两种不同的格式。这是一个修复:
Santa.Lucia$Date2 <- ifelse(nchar(as.character(Santa.Lucia$Date)) > 10,
as.Date(Santa.Lucia$Date, format="%d-%b-%Y"),
as.Date(Santa.Lucia$Date, format="%d/%m/%Y") )
我尝试了一个更简单的模型:
( model6 <-glmer(No.Specimens~Altitude+(1|Date2)+(1|Trap),family="poisson",data=Santa.Lucia,na.action=na.omit) )
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
glmerMod]
Family: poisson ( log )
Formula: No.Specimens ~ Altitude + (1 | Date2) + (1 | Trap)
Data: Santa.Lucia
AIC BIC logLik deviance df.resid
368.6522 378.9510 -180.3261 360.6522 93
Random effects:
Groups Name Std.Dev.
Date2 (Intercept) 0.2248
Trap (Intercept) 0.0000
Number of obs: 97, groups: Date2, 6; Trap, 2
Fixed Effects:
(Intercept) Altitude
1.3696125 -0.0004992
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0516296 (tol = 0.001, component 3)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
我实际上能够在没有错误或警告的情况下运行我建议的修改,但我认为使用这两个分组是不正确的,因为一个预测另一个:
> table(Santa.Lucia$Date2, Santa.Lucia$Trap)
Net Pan
16310 0 11
16312 11 0
16313 11 0
16314 0 31
16589 0 17
16590 0 16
这就是你不收敛的原因。这不是错误模型,而是设计和数据收集中的病理。我怀疑你是否真的有足够的数据来支持混合模型:
( model5 <-glm(No.Specimens~Altitude,family="poisson",data=Santa.Lucia,na.action=na.omit) )
Call: glm(formula = No.Specimens ~ Altitude, family = "poisson", data = Santa.Lucia,
na.action = na.omit)
Coefficients:
(Intercept) Altitude
1.4218234 -0.0005391
Degrees of Freedom: 96 Total (i.e. Null); 95 Residual
Null Deviance: 215.3
Residual Deviance: 213.2 AIC: 368.6
与二次高度模型进行比较:
( model5.2 <-glm(No.Specimens~poly(Altitude,2),family="poisson",data=Santa.Lucia,na.action=na.omit) )
Call: glm(formula = No.Specimens ~ poly(Altitude, 2), family = "poisson",
data = Santa.Lucia, na.action = na.omit)
Coefficients:
(Intercept) poly(Altitude, 2)1 poly(Altitude, 2)2
0.3188 -1.7116 -3.9539
Degrees of Freedom: 96 Total (i.e. Null); 94 Residual
Null Deviance: 215.3
Residual Deviance: 194.6 AIC: 352
> anova(model5.2)
Analysis of Deviance Table
Model: poisson, link: log
Response: No.Specimens
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev
NULL 96 215.31
poly(Altitude, 2) 2 20.698 94 194.61
> anova(model5.2, model5)
Analysis of Deviance Table
Model 1: No.Specimens ~ poly(Altitude, 2)
Model 2: No.Specimens ~ Altitude
Resid. Df Resid. Dev Df Deviance
1 94 194.61
2 95 213.20 -1 -18.59
答案 2 :(得分:1)
你几乎就在那里。正如@BondedDust所说,它不实用
使用两级因子(Trap
)作为随机效应;事实上,
它原则上似乎也不正确(Trap
的等级
不是随意/随意选择/可交换的)。当我尝试一个模型
具有二次高度,陷阱的固定效应和随机效应
Date
,我被警告说我可能想重新调整参数:
Some predictor variables are on very different scales: consider rescaling
(您看到此警告与您的错误消息混在一起)。唯一的连续(因此值得重新调整)预测器是Altitude
,所以我用scale()
集中并缩放它(唯一的缺点是它改变了系数的定量解释,但模型本身实际上是相同)。我还添加了观察级随机效应以允许过度离散。
结果似乎没问题,并同意图片。
library(lme4)
Santa.Lucia <- transform(Santa.Lucia,
scAlt=scale(Altitude),
obs=factor(seq(nrow(Santa.Lucia))))
model7 <- glmer(No.Specimens~scAlt+I(scAlt^2)+Trap+(1|Date)+(1|obs),
family="poisson",data=Santa.Lucia,na.action=na.omit)
summary(model7)
## Random effects:
## Groups Name Variance Std.Dev.
## obs (Intercept) 0.64712 0.8044
## Date (Intercept) 0.02029 0.1425
## Number of obs: 97, groups: obs, 97; Date, 6
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.53166 0.31556 1.685 0.09202 .
## scAlt -0.22867 0.14898 -1.535 0.12480
## I(scAlt^2) -0.52840 0.16355 -3.231 0.00123 **
## TrapPan -0.01853 0.32487 -0.057 0.95451
通过与缺乏它的模型进行比较来测试二次项...
model7R <- update(model7, . ~ . - I(scAlt^2))
## convergence warning, but probably OK ...
anova(model7,model7R)
原则上,可能值得研究二次高度模型和陷阱之间的相互作用(通过陷阱类型允许不同的高度趋势),但图片显示它不会做太多......
library(ggplot2); theme_set(theme_bw())
ggplot(Santa.Lucia,aes(Altitude,No.Specimens,colour=Trap))+
stat_sum(aes(size=factor(..n..)))+
scale_size_discrete(range=c(2,4))+
geom_line(aes(group=Date),colour="gray",alpha=0.3)+
geom_smooth(method="gam",family="quasipoisson",
formula=y~poly(x,2))+
geom_smooth(method="gam",family="quasipoisson",
formula=y~poly(x,2),se=FALSE,
aes(group=1),colour="black")