Question

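I'm confused about how scikit-learn generates thresholds in Python. In the example below, four thresholds are generated, but when I change the third value in pred to 0.6, the number of thresholds drops to 3. Can anyone explain why this happens?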

#Example 1
import numpy as np
from sklearn import metrics
y = np.array([0, 0, 1, 1])
pred = np.array([0.1, 0.4, 0.3, 0.8])  # Please note the third value here is `0.3`
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1)
fpr, tpr, thresholds 


(array([0. , 0.5, 0.5, 1. ]),
 array([0.5, 0.5, 1. , 1. ]),
 array([0.8, 0.4, 0.3, 0.1]))

#Example 2
y = np.array([0, 0, 1, 1])
pred = np.array([0.1, 0.4, 0.6, 0.8])
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1)
fpr, tpr, thresholds 

(array([0., 0., 1.]),
 array([0.5, 1. , 1. ]),
 array([0.8, 0.6, 0.1]))

Answer 1

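There is a keyword argument drop_intermediate, which defaults to True:

drop_intermediate: boolean, optional (default=True). Whether to drop some suboptimal thresholds which would not appear on a plotted ROC curve. This is useful in order to create lighter ROC curves. New in version 0.17: parameter drop_intermediate.

So changing your code to: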

fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1, drop_intermediate=False)
fpr, tpr, thresholds

gives:

(array([0. , 0. , 0.5, 1. ]),
 array([0.5, 1. , 1. , 1. ]),
 array([0.8, 0.6, 0.4, 0.1]))
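
To see which thresholds the default setting removes, one option is to call roc_curve twice and compare the results. The sketch below does that for both examples from the question; the helper name dropped_thresholds is made up for illustration and is not part of scikit-learn. Newer scikit-learn releases also prepend an extra threshold for the (0, 0) point, so the arrays may be one element longer than the ones shown above. The set difference is not affected by that, since the extra value appears in both calls.

```python
import numpy as np
from sklearn import metrics


def dropped_thresholds(y_true, scores):
    """Return thresholds removed by drop_intermediate=True (hypothetical helper)."""
    # Thresholds kept with the default behaviour
    _, _, kept = metrics.roc_curve(y_true, scores, pos_label=1,
                                   drop_intermediate=True)
    # Every candidate threshold, with nothing dropped
    _, _, full = metrics.roc_curve(y_true, scores, pos_label=1,
                                   drop_intermediate=False)
    # Values present in the full list but missing from the default list
    return np.setdiff1d(full, kept)


y = np.array([0, 0, 1, 1])

# Example 1: every point is a corner of the curve, so nothing is dropped
dropped_1 = dropped_thresholds(y, np.array([0.1, 0.4, 0.3, 0.8]))
print(dropped_1)

# Example 2: the point at threshold 0.4 lies on a straight segment
dropped_2 = dropped_thresholds(y, np.array([0.1, 0.4, 0.6, 0.8]))
print(dropped_2)
```

In Example 2 the full curve passes through (0, 1), (0.5, 1) and (1, 1). The middle point, which belongs to threshold 0.4, sits on a straight horizontal segment, so removing it does not change the plotted curve or the area under it.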

You can find it in the documentation.
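
As a rough illustration of what "would not appear on a plotted ROC curve" means, the sketch below marks a point as intermediate when the step into it equals the step out of it in both coordinates, that is, when the second difference of both fpr and tpr is zero. The function name is_corner is an assumption for this example, and this is a simplified check on the values from this thread, not necessarily how scikit-learn implements it internally. With arbitrary floating-point rates an exact zero test may need a tolerance.

```python
import numpy as np


def is_corner(fpr, tpr):
    """Mark ROC points where the curve bends; the endpoints are always kept."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    # A nonzero second difference in either coordinate means the curve
    # changes direction (or step size) at that interior point
    bends = np.logical_or(np.diff(fpr, 2), np.diff(tpr, 2))
    return np.r_[True, bends, True]


# Full curve for Example 2, copied from the drop_intermediate=False output
fpr = [0.0, 0.0, 0.5, 1.0]
tpr = [0.5, 1.0, 1.0, 1.0]
thresholds = np.array([0.8, 0.6, 0.4, 0.1])

mask = is_corner(fpr, tpr)
print(thresholds[mask])  # thresholds that survive the check

# Example 1 from the question: every point is a corner
mask_1 = is_corner([0.0, 0.5, 0.5, 1.0], [0.5, 0.5, 1.0, 1.0])
print(mask_1)
```

For Example 2 the check keeps 0.8, 0.6 and 0.1 and discards 0.4, which matches the three thresholds the question reports. For Example 1 all four points are kept, which is why that case still returns four thresholds.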
