How to use R?

Date: 2018-03-13 19:16:48

Tags: r logistic-regression

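I am trying to assess the goodness of fit of a logistic regression model I built. I was initially advised to use the Hosmer-Lemeshow test, but after further research I learned that it is less reliable than the omnibus goodness-of-fit test described by Hosmer et al. As far as I understand, the residuals.lrm function in the R package rms runs the le Cessie - van Houwelingen - Copas - Hosmer unweighted sum of squares test.

I built the following model: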

> NEDOCModel <- glm(complication ~ ultrasound + fNEDOC, family = "binomial", data = modelmain);
> summary(NEDOCModel);

Call:
glm(formula = complication ~ ultrasound + fNEDOC, family = "binomial", 
data = modelmain, x = TRUE, y = TRUE)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.5841  -0.5812  -0.4899  -0.4899   2.0878  

Coefficients:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   -1.69293    0.10126 -16.719   <2e-16 ***
ultrasound1                   -0.36661    0.12514  -2.929   0.0034 **
fNEDOCOvercrowded (140 - 200)  0.01087    0.13524   0.080   0.9359    

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1765.6  on 2284  degrees of freedom
Residual deviance: 1757.1  on 2282  degrees of freedom

AIC: 1763.1

Number of Fisher Scoring iterations: 4

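Here complication is a binary outcome (0 or 1), and ultrasound and fNEDOC are binary predictors (two levels each).

Following the description (and examples) of the residuals.lrm function, I get the following error: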

> resid(NEDOCModel, "gof");
Error in match.arg(type) : 
  'arg' should be one of “deviance”, “pearson”, “working”, “response”, “partial”

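As an amateur who is relatively new to this field, I would greatly appreciate any help in resolving this error, and any guidance that ensures I am assessing my model correctly.

Thanks in advance!

Edit: here is a small sample of the data: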

simExample <- structure(list(complication = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", 
"1"), class = "factor"), ultrasound = structure(c(1L, 2L, 2L, 
1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 
1L), .Label = c("0", "1"), class = "factor"), fNEDOC = structure(c(1L, 
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 
1L, 1L, 2L), .Label = c("Not Overcrowded (00 - 140)", "Overcrowded (140 -
200)"), class = "factor")), .Names = c("complication", "ultrasound",
"fNEDOC"), row.names = c(NA, 20L), class = "data.frame")

View(simExample)
   complication ultrasound                     fNEDOC
1             0          0 Not Overcrowded (00 - 140)
2             0          1    Overcrowded (140 - 200)
3             0          1 Not Overcrowded (00 - 140)
4             0          0 Not Overcrowded (00 - 140)
5             0          1 Not Overcrowded (00 - 140)
6             0          0 Not Overcrowded (00 - 140)
7             0          0 Not Overcrowded (00 - 140)
8             0          1    Overcrowded (140 - 200)
9             0          1 Not Overcrowded (00 - 140)
10            1          0    Overcrowded (140 - 200)
11            0          0 Not Overcrowded (00 - 140)
12            0          1 Not Overcrowded (00 - 140)
13            0          1 Not Overcrowded (00 - 140)
14            0          1    Overcrowded (140 - 200)
15            0          1    Overcrowded (140 - 200)
16            0          1 Not Overcrowded (00 - 140)
17            0          1 Not Overcrowded (00 - 140)
18            0          0 Not Overcrowded (00 - 140)
19            0          1 Not Overcrowded (00 - 140)
20            0          0    Overcrowded (140 - 200)

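1 Answer:

Answer 0 (score: 0)

If the outcome is binomial or ordinal, you can use lrm. However, checking an omnibus measure of "fit" is not how I judge whether a model is valid. I look at the residuals against each variable to consider possible nonlinearity, and to assess whether rcs spline terms are needed for a better fit. Your example is not complex enough to support a demonstration of that approach. And ... you say that something (code that is not actually included in your question) did not "work", without stating what "working" would mean.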
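
A minimal sketch of the kind of nonlinearity check described above, on simulated data rather than the asker's data. The continuous predictor `age`, the sample size, and the quadratic true relationship are all assumptions made up for illustration; the number of knots (4) is likewise an arbitrary choice.

```r
library(rms)

# Simulated data (hypothetical): a binary outcome whose log-odds
# depend on a continuous predictor in a U-shaped, nonlinear way.
set.seed(42)
n <- 500
sim <- data.frame(age = runif(n, 20, 80))
sim$y <- rbinom(n, 1, plogis(-3 + 0.002 * (sim$age - 50)^2))

# Fit with a restricted cubic spline on age (4 knots, chosen arbitrarily).
fit_rcs <- lrm(y ~ rcs(age, 4), data = sim, x = TRUE, y = TRUE)

# anova() on an rms fit reports a separate "Nonlinear" row for the
# spline terms, i.e. a test of whether the nonlinear part is needed.
a <- anova(fit_rcs)
print(a)
```

If the "Nonlinear" row is clearly significant, a plain linear term for that predictor is probably too restrictive. With only two-level factors, as in the question, there is nothing for a spline to model, which is why the posted example cannot show this.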

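library(rms)
# x = TRUE and y = TRUE store the design matrix and response in the fit,
# which residuals() needs for most residual types.
NEDOCModel <- lrm(complication ~ ultrasound + fNEDOC, data = simExample, x = TRUE, y = TRUE)

residuals(NEDOCModel)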
#--------
 [1] -9.437191e-05 -1.746181e-04 -1.648351e-08 -9.437191e-05 -1.648351e-08 -9.437191e-05
 [7] -9.437191e-05 -1.746181e-04 -1.648351e-08  5.000005e-01 -9.437191e-05 -1.648351e-08
[13] -1.648351e-08 -1.746181e-04 -1.746181e-04 -1.648351e-08 -1.648351e-08 -9.437191e-05
[19] -1.648351e-08 -4.999995e-01
residuals(NEDOCModel,type="gof")
#-------
Sum of squared errors     Expected value|H0                    SD                     Z 
         0.5000001754          0.5012646602          0.0003628642         -3.4847333888 
                    P 
         0.0004926276 
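
The numbers in this output are related by simple arithmetic, which can be checked directly. The sketch below assumes that Z is the standardized difference between the observed sum of squared errors and its expected value under H0, and that P is a two-sided normal p-value. Both assumptions are consistent with the printed values but are not taken from the rms source code.

```r
# Values copied from the residuals(NEDOCModel, type = "gof") output above.
sse      <- 0.5000001754   # Sum of squared errors
expected <- 0.5012646602   # Expected value|H0
sd_h0    <- 0.0003628642   # SD

# Assumed relationship: standardized difference, then a two-sided normal p-value.
z <- (sse - expected) / sd_h0
p <- 2 * pnorm(-abs(z))

print(c(Z = z, P = p))
```

Keep in mind that this sample has only one event in 20 rows, so the test result here illustrates the mechanics and says little about the full model.
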
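
However, you should learn to use the rest of the facilities in the rms/Hmisc system of interlinked functions. You have not defined a datadist, so the summary function will not work on an lrm-classed object: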

ddist <- datadist(simExample)
options(datadist='ddist')

summary(NEDOCModel)
#-----------
            Effects              Response : complication 

 Factor                                                       Low High Diff. Effect     S.E.  
 ultrasound - 0:1                                             2   1    NA        8.6527 37.864
  Odds Ratio                                                  2   1    NA     5725.8000     NA
 fNEDOC - Overcrowded (140 -\n200):Not Overcrowded (00 - 140) 1   2    NA        9.2682 42.045
  Odds Ratio                                                  1   2    NA    10595.0000     NA
 Lower 0.95  Upper 0.95
 -6.5559e+01 8.2865e+01
  3.3731e-29 9.7195e+35
 -7.3139e+01 9.1676e+01
  1.7218e-32 6.5198e+39
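
As for the error shown in the question, a likely explanation is that the model there was fitted with glm, so resid() dispatches to the residuals method for glm objects, and that method has no "gof" type. The sketch below illustrates this with base R only, on simulated data (the data frame `d` and its columns are hypothetical, not the asker's data).

```r
# Simulated binary outcome and binary predictor (hypothetical data).
set.seed(1)
d <- data.frame(x = rbinom(200, 1, 0.5))
d$y <- rbinom(200, 1, plogis(-1 + 0.5 * d$x))

fit_glm <- glm(y ~ x, family = binomial, data = d)
class(fit_glm)  # a glm object, so residuals() uses the glm method

# Residual types the glm method accepts, read from its argument defaults.
allowed <- eval(formals(getS3method("residuals", "glm"))$type)
print(allowed)

# Asking for "gof" fails in match.arg(); capture the message instead of stopping.
gof_attempt <- tryCatch(
  residuals(fit_glm, type = "gof"),
  error = function(e) conditionMessage(e)
)
print(gof_attempt)
```

Refitting the same formula with lrm(..., x = TRUE, y = TRUE), as done earlier in this answer, gives an object for which type = "gof" is available.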