我是混合效果和添加模型的新手,所以如果这里的答案很简单,我很抱歉。
我收集了几种代谢化学物质(M1,M2 ......),协变量(时间,种族,性别......)和疾病状态(D,D.binary)的数据。我正在尝试根据从GEE变量选择中选择的变量生成GAMM。
数据:
这是我现在的模型:
> b = gamm(D.binary ~ Time + s(M1) ,
random = list(ParticipantID = ~ 1 + Time), niterPQL=50,
data = NEC_data, family=binomial(link="logit"))
Maximum number of PQL iterations: 50
iteration 1
iteration 2
...
iteration 49
iteration 50
Warning message:
In gammPQL(y ~ X - 1, random = rand, data = strip.offset(mf), family = family, :
gamm not converged, try increasing niterPQL
> plot(b$gam,pages=1)
> summary(b$lme) # details of underlying lme fit
Linear mixed-effects model fit by maximum likelihood
Data: data
AIC BIC logLik
-160 -124 88
Random effects:
Formula: ~Xr - 1 | g
Structure: pdIdnot
Xr1 Xr2 Xr3 Xr4 Xr5 Xr6 Xr7 Xr8
StdDev: 0.812 0.812 0.812 0.812 0.812 0.812 0.812 0.812
Formula: ~1 + Time | ParticipantID %in% g
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 5.68324 (Intr)
Time 0.50739 -0.92
Residual 0.00691
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: list(fixed)
Value Std.Error DF t-value p-value
X(Intercept) -2.81 0.729 573 -3.86 0.0001
XTime -0.15 0.065 573 -2.30 0.0220
Xs(M1)Fx1 -1.60 0.066 573 -24.29 0.0000
Correlation:
X(Int) XTime
XTime -0.920
Xs(M1)Fx1 0.004 0.000
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.3472 -0.0692 -0.0117 0.0305 20.7271
Number of Observations: 636
Number of Groups:
g ParticipantID %in% g
1 61
> summary(b$gam) # gam style summary of fitted model
Family: binomial
Link function: logit
Formula:
NEC.binary ~ Time + s(M1)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.8135 0.7289 -3.86 0.00013 ***
Time -0.1495 0.0651 -2.30 0.02188 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(M1) 4.1 4.1 14913 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0872
Scale est. = 4.7815e-05 n = 636
> anova(b$gam)
Family: binomial
Link function: logit
Formula:
NEC.binary ~ Time + s(M1)
Parametric Terms:
df F p-value
Time 1 5.28 0.022
Approximate significance of smooth terms:
edf Ref.df F p-value
s(M1) 4.1 4.1 14913 <2e-16
> gam.check(b$gam)
我怀疑我可能搞砸了一些相当基本的东西,因为M1是疾病状态最明显的鉴别者。它很重要(应该如此),但相关性非常低。而且,显然,模型没有收敛(即使我将迭代次数从20-> 50增加)。最后,检查图看起来非常离谱
我是否犯了基本语法错误?在我的模型中是否有一些恶意组件我在看?任何帮助将不胜感激。
我想在模型和2个协变量(Birthweight和Race)中添加另一种代谢物(M2)。当我将M2添加到模型时,我得到一个非收敛错误:
> b = gamm(D.binary ~ Time + s(M1) + s(M2) ,
random = list(ParticipantID = ~ 1 + Time), niterPQL=20, correlation = corLin(),
data = NEC_data, family=binomial(link="logit"))
Maximum number of PQL iterations: 20
iteration 1
iteration 2
Error in lme.formula(fixed = fixed, random = random, data = data, correlation = correlation, :
nlminb problem, convergence error code = 1
message = false convergence (8)
关于进入多维空间的任何建议也将受到赞赏。
我也尝试了这个模型的离散疾病分类(对照:0,1疾病:2,3)和泊松噪声。
> b = gamm(NEC ~ DPP + s(DSLNT_ug.mL) ,
+ random = list(ParticipantID = ~ 1 + DPP), niterPQL=20,
+ data = NEC_data, family=poisson)
Maximum number of PQL iterations: 20
iteration 1
iteration 2
...
iteration 19
iteration 20
Error in solve.default(pdMatrix(a, factor = TRUE)) :
system is computationally singular: reciprocal condition number = 3.13906e-19