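I am trying to evaluate the goodness of fit of a logistic regression model I have built. My first plan was to use the Hosmer-Lemeshow test, but after further research I learned that it is not as reliable as the omnibus goodness-of-fit test described by Hosmer et al. As far as I understand, residuals.lrm in the R package rms is how one runs the le Cessie - van Houwelingen - Copas - Hosmer unweighted sum of squares test.
I built the following model: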
> NEDOCModel <- glm(complication ~ ultrasound + fNEDOC, family = "binomial", data = modelmain);
> summary(NEDOCModel);
Call:
glm(formula = complication ~ ultrasound + fNEDOC, family = "binomial",
data = modelmain, x = TRUE, y = TRUE)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.5841  -0.5812  -0.4899  -0.4899   2.0878
Coefficients:
                               Estimate Std. Error z value Pr(>|z|)
(Intercept)                    -1.69293    0.10126 -16.719   <2e-16 ***
ultrasound1                    -0.36661    0.12514  -2.929   0.0034 **
fNEDOCOvercrowded (140 - 200)   0.01087    0.13524   0.080   0.9359
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1765.6 on 2284 degrees of freedom
Residual deviance: 1757.1 on 2282 degrees of freedom
AIC: 1763.1
Number of Fisher Scoring iterations: 4
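Here complication is a binary outcome (0 or 1), and ultrasound and fNEDOC are binary predictors (0 or 1).
Following the description (and example) for the residuals.lrm function, I get the following error: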
> resid(NEDOCModel, "gof");
Error in match.arg(type) :
'arg' should be one of “deviance”, “pearson”, “working”, “response”, “partial”
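As an amateur who is relatively new to the field, I would greatly appreciate any help in resolving this error, and any guidance to make sure I evaluate my model properly.
Thanks in advance!
Edit: here is a small sample of the data: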
simExample <- structure(list(complication = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0",
"1"), class = "factor"), ultrasound = structure(c(1L, 2L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L,
1L), .Label = c("0", "1"), class = "factor"), fNEDOC = structure(c(1L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L,
1L, 1L, 2L), .Label = c("Not Overcrowded (00 - 140)", "Overcrowded (140 - 200)"),
class = "factor")), .Names = c("complication", "ultrasound",
"fNEDOC"), row.names = c(NA, 20L), class = "data.frame")
View(simExample)
complication ultrasound fNEDOC
1 0 0 Not Overcrowded (00 - 140)
2 0 1 Overcrowded (140 - 200)
3 0 1 Not Overcrowded (00 - 140)
4 0 0 Not Overcrowded (00 - 140)
5 0 1 Not Overcrowded (00 - 140)
6 0 0 Not Overcrowded (00 - 140)
7 0 0 Not Overcrowded (00 - 140)
8 0 1 Overcrowded (140 - 200)
9 0 1 Not Overcrowded (00 - 140)
10 1 0 Overcrowded (140 - 200)
11 0 0 Not Overcrowded (00 - 140)
12 0 1 Not Overcrowded (00 - 140)
13 0 1 Not Overcrowded (00 - 140)
14 0 1 Overcrowded (140 - 200)
15 0 1 Overcrowded (140 - 200)
16 0 1 Not Overcrowded (00 - 140)
17 0 1 Not Overcrowded (00 - 140)
18 0 0 Not Overcrowded (00 - 140)
19 0 1 Not Overcrowded (00 - 140)
20 0 0 Overcrowded (140 - 200)
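Answer 0 (score: 0)
If the outcome is binomial or ordinal, you can use lrm. However, examining omnibus measures of "fit" is not how I judge the validity of a model. I look at the residuals for each variable to consider the possibility of nonlinearity, and to assess whether rcs spline functions are needed for a better fit. Your example is not complex enough to support a demonstration of that approach. And ... you say that something (code that was not actually included in your question) did not "work", without clearly stating what "work" might mean.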
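Since the posted sample has no continuous predictor, the residual-based check described above can only be sketched on simulated data. The data frame d, the predictor x, and the object names below are hypothetical and not part of the question; the simulated relationship is deliberately nonlinear so that the check has something to find.

```r
library(rms)

set.seed(1)
n <- 500
x <- runif(n, -2, 2)
# Assumed data-generating process: the log-odds depend on x^2, not on x
y <- rbinom(n, 1, plogis(x^2 - 1))
d <- data.frame(x = x, y = y)

# Fit with a linear term only; x = TRUE, y = TRUE are needed for residuals
fit_lin <- lrm(y ~ x, data = d, x = TRUE, y = TRUE)

# Partial residuals, one column per predictor; a smooth of these against x
# that is clearly curved suggests the linear term is inadequate
pr <- residuals(fit_lin, type = "partial")
sm <- lowess(d$x, pr[, 1])

# Refit with a restricted cubic spline (4 knots) and inspect the
# "Nonlinear" row of the Wald table
fit_rcs <- lrm(y ~ rcs(x, 4), data = d, x = TRUE, y = TRUE)
anova(fit_rcs)
```

Plotting sm (for example with plot(d$x, pr[, 1]); lines(sm)) gives the visual version of the same check.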
library(rms)
NEDOCModel <- lrm(complication ~ ultrasound + fNEDOC, data = simExample, y=TRUE,x=TRUE);
residuals(NEDOCModel)
#--------
[1] -9.437191e-05 -1.746181e-04 -1.648351e-08 -9.437191e-05 -1.648351e-08 -9.437191e-05
[7] -9.437191e-05 -1.746181e-04 -1.648351e-08 5.000005e-01 -9.437191e-05 -1.648351e-08
[13] -1.648351e-08 -1.746181e-04 -1.746181e-04 -1.648351e-08 -1.648351e-08 -9.437191e-05
[19] -1.648351e-08 -4.999995e-01
residuals(NEDOCModel,type="gof")
#-------
Sum of squared errors Expected value|H0 SD Z
0.5000001754 0.5012646602 0.0003628642 -3.4847333888
P
0.0004926276
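The error in the question comes from method dispatch: a model fitted with glm has class "glm", so resid() calls residuals.glm, which accepts only the types listed in the error message. The "gof" type belongs to residuals.lrm and therefore needs a model fitted with lrm. A minimal sketch of the difference, using simulated data of roughly the same shape as the full data set (the data frame sim and its coefficients are assumptions for illustration, not the original data):

```r
library(rms)

set.seed(42)
n <- 2285
# Hypothetical stand-in for the full data: two binary factors, binary outcome
sim <- data.frame(
  ultrasound = factor(rbinom(n, 1, 0.5)),
  fNEDOC     = factor(rbinom(n, 1, 0.3))
)
sim$complication <- rbinom(n, 1, plogis(-1.7 - 0.37 * (sim$ultrasound == "1")))

fit_glm <- glm(complication ~ ultrasound + fNEDOC, family = binomial, data = sim)
fit_lrm <- lrm(complication ~ ultrasound + fNEDOC, data = sim, x = TRUE, y = TRUE)

# residuals.glm has no "gof" type, so this call fails; capture the message
err <- tryCatch(resid(fit_glm, "gof"), error = function(e) conditionMessage(e))
print(err)

# The two fits estimate the same model; only the coefficient names differ
print(cbind(glm = unname(coef(fit_glm)), lrm = unname(coef(fit_lrm))))

# residuals.lrm does provide the unweighted sum of squares test
gof <- residuals(fit_lrm, type = "gof")
print(gof)
```

The x = TRUE, y = TRUE arguments matter here: residuals.lrm needs the stored design matrix and response to compute the test.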
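However, you should learn to use the rest of the facilities in the rms/Hmisc system of interconnected functions. You have not defined a datadist, so the summary function will not work on lrm-class objects: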
ddist <- datadist(simExample)
options(datadist='ddist')
summary(NEDOCModel)
#-----------
             Effects              Response : complication

Factor                                                       Low High Diff.     Effect   S.E.  Lower 0.95 Upper 0.95
ultrasound - 0:1                                               2    1    NA     8.6527 37.864 -6.5559e+01 8.2865e+01
 Odds Ratio                                                    2    1    NA  5725.8000     NA  3.3731e-29 9.7195e+35
fNEDOC - Overcrowded (140 -\n200):Not Overcrowded (00 - 140)   1    2    NA     9.2682 42.045 -7.3139e+01 9.1676e+01
 Odds Ratio                                                    1    2    NA 10595.0000     NA  1.7218e-32 6.5198e+39
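To read this table: each Odds Ratio row is the exponential of the Effect row above it, and the same holds for the confidence limits. The short check below copies the rounded values from the output, so agreement is only to a few significant digits. The enormous intervals are what one would expect from a 20-row sample that contains a single complication, so these numbers say little about the full data set.

```r
# Effects (log odds ratios) and their limits, copied from the summary output
effect <- c(ultrasound = 8.6527, fNEDOC = 9.2682)
lower  <- c(ultrasound = -65.559, fNEDOC = -73.139)
upper  <- c(ultrasound = 82.865, fNEDOC = 91.676)

# Exponentiating moves from the log-odds scale to the odds-ratio scale
odds_ratio <- exp(effect)
ci <- exp(cbind(lower, upper))

print(odds_ratio)
print(ci)
```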