Question

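I have two questions about plotting ROC curves with the pROC package.

A. The significance level, or P-value, is the probability of finding the observed sample area under the ROC curve when, in fact, the true (population) area under the ROC curve is 0.5 (null hypothesis: area = 0.5). If P is small (P < 0.05), it can be concluded that the area under the ROC curve is significantly different from 0.5, and therefore that there is evidence that the laboratory test can distinguish between the two groups.

So I want to test whether a given area under the ROC curve is significantly different from 0.50. I found the following code, which compares two ROC curves using the pROC package, but I am not sure how to test whether a single AUC differs significantly from 0.5.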

    library(pROC)
    data(aSAH)

    rocobj1 <- plot.roc(aSAH$outcome, aSAH$s100b,
                        main="Statistical comparison",
                        percent=TRUE, col="#1c61b6")

    rocobj2 <- lines.roc(aSAH$outcome, aSAH$ndka,
                         percent=TRUE, col="#008600")

    testobj <- roc.test(rocobj1, rocobj2)
    text(50, 50,
         labels=paste("p-value =", format.pval(testobj$p.value)),
         adj=c(0, .5))

    legend("bottomright", legend=c("S100B", "NDKA"),
           col=c("#1c61b6", "#008600"), lwd=2)

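B. I have run k-fold cross-validation on my classification problem. For example, 5-fold cross-validation produces 5 ROC curves. How can I plot the average of these 5 ROC curves using the pROC package (what I want to do is explained on a web page, but done in Python; see the scikit-learn link in the references)? Also, can we get a confidence interval and the best threshold for this average ROC curve (similar to what the code below does)?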

    rocobj <- plot.roc(aSAH$outcome, aSAH$s100b,  
                       main="Confidence intervals", 
                       percent=TRUE,  ci=TRUE, # compute the CI (of the AUC by default)  
                       print.auc=TRUE) # print the AUC (will contain the CI)  

    ciobj <- ci.se(rocobj, # CI of sensitivity  
                   specificities=seq(0, 100, 5)) # over a select set of specificities  
    plot(ciobj, type="shape", col="#1c61b6AA") # plot as a blue shape  
    plot(ci(rocobj, of="thresholds", thresholds="best")) # add one threshold

References:

http://web.expasy.org/pROC/screenshots.html

http://scikit-learn.org/0.13/auto_examples/plot_roc_crossval.html

http://www.talkstats.com/showthread.php/14487-ROC-significance

http://www.medcalc.org/manual/roc-curves.php

Answer 1

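A. Use wilcox.test, which does exactly that: the Wilcoxon rank-sum (Mann-Whitney) statistic is a rescaled AUC, so its p-value tests whether the AUC differs from 0.5.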
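As a minimal sketch of point A, assuming the aSAH data and the S100B marker from the question, the test could look like the code below. The variable names (rocobj, wt, w_auc) are only illustrative, and `exact = FALSE` is my own choice to avoid the ties warning.

```r
library(pROC)
data(aSAH)

# ROC curve of S100B against outcome, with levels and direction set explicitly
rocobj <- roc(aSAH$outcome, aSAH$s100b,
              levels = c("Good", "Poor"), direction = "<")
auc(rocobj)

# Wilcoxon rank-sum test of the marker between the two outcome groups.
# Under its null hypothesis the AUC is 0.5.
wt <- wilcox.test(s100b ~ outcome, data = aSAH, exact = FALSE)
wt$p.value

# Sanity check: W / (n1 * n2) is the AUC, or 1 - AUC depending on group order
n <- table(aSAH$outcome)
w_auc <- unname(wt$statistic) / prod(n)
max(w_auc, 1 - w_auc)
```

If the last value matches `auc(rocobj)`, the p-value from wilcox.test can be read as the significance of the AUC against 0.5.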

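B. See my answer to this question: Feature selection + cross-validation, but how to make ROC-curves in R, and simply concatenate the data from each fold of the cross-validation (but do not do this when you repeat the whole cross-validation multiple times, or when the predictions cannot be compared between runs).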
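A rough sketch of the concatenation idea follows. The logistic regression on s100b and ndka, the fold assignment, the seed, and `boot.n = 500` are all assumptions made for illustration; they are not from the question or the linked answer. Note that this gives one pooled curve over all out-of-fold predictions rather than a pointwise average of 5 curves.

```r
library(pROC)
data(aSAH)

set.seed(42)  # arbitrary seed so the folds are reproducible
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(aSAH)))

# Out-of-fold predicted probabilities, filled in one fold at a time
cv_pred <- numeric(nrow(aSAH))
for (i in seq_len(k)) {
  test_idx <- folds == i
  # Hypothetical classifier: logistic regression on two markers
  fit <- glm(outcome ~ s100b + ndka, data = aSAH[!test_idx, ],
             family = binomial)
  cv_pred[test_idx] <- predict(fit, newdata = aSAH[test_idx, ],
                               type = "response")
}

# One ROC curve over the concatenated out-of-fold predictions
cv_roc <- roc(aSAH$outcome, cv_pred,
              levels = c("Good", "Poor"), direction = "<", percent = TRUE)
plot(cv_roc, print.auc = TRUE, main = "Pooled 5-fold cross-validation")

ci(cv_roc)                                   # CI of the pooled AUC
ci(cv_roc, of = "thresholds",
   thresholds = "best", boot.n = 500)        # bootstrap CI at the best threshold
coords(cv_roc, "best")                       # the best threshold itself
```

Because cv_roc is an ordinary roc object, the ci.se and plotting calls from the question should work on it in the same way.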

ROC curve plot: 0.50 significance and cross-validation
