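I have seen other questions that deal with scikit-learn's roc_curve function returning far fewer values than there are data points, and I know this happens when the probability values contain only a small number of unique values.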
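For reference, that known case can be reproduced with a small sketch. The labels and scores here are made up purely for illustration and are not my data; the variable names are arbitrary.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Made-up labels and scores for illustration only (not the data from this question).
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
# Eight samples, but only two distinct score values: 0.2 and 0.8.
scores = np.array([0.2, 0.2, 0.8, 0.8, 0.2, 0.8, 0.8, 0.2])

fpr, tpr, thresholds = roc_curve(y_true, scores, pos_label=1)

print("samples:", len(y_true))
print("distinct scores:", np.unique(scores).size)
print("ROC points:", len(fpr))
```

Each distinct score gives at most one threshold, so the curve cannot have more points than the number of distinct scores plus the starting point.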
The output from the first iteration of the for loop is shown below:
y_test: [0. 1. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 1.
1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1.]
probas: [0.97980869 0.61031697 0.9463976 0.07607395 0.93956894 0.06914656
0.64741115 0.07618758 0.95895803 0.83249766 0.13942336 0.7326476
0.93728438 0.07894027 0.97504296 0.92879864 0.93744224 0.21646299
0.95141726 0.92728865 0.97493415 0.07854641 0.95159664 0.36212405
0.21415855 0.10376292 0.95303641 0.11629533 0.93807975 0.7540189
0.93019584 0.94054764 0.93755026 0.93893753 0.95637685 0.10910955
0.96091857 0.95273078 0.61031697 0.9745807 0.11621697 0.97879922
0.96512002 0.09424992]
Code:
from sklearn.metrics import roc_curve, auc

for i in range(0, 2):
    print("y_test: ", y_test[:, 1])
    print("probas: ", probas_[:, i])
    # ROC for score column i against the class-1 labels
    fpr[i], tpr[i], _ = roc_curve(y_test[:, 1], probas_[:, i], pos_label=1)
    roc_auc[i] = auc(fpr[i], tpr[i])
print("fpr", fpr)
print("tpr", tpr)
print("roc", roc_auc)
Result:
fpr {0: array([0., 0., 0., 1.]), 1: array([0. , 0.03571429, 1. , 1. , 1. ]), 'micro': array([0. , 0.02272727, 0.95454545, 0.95454545, 1. ,
1. ])}
tpr {0: array([0.0625, 0.875 , 1. , 1. ]), 1: array([0. , 0. , 0. , 0.125, 1. ]), 'micro': array([0. , 0. , 0. , 0.04545455, 0.04545455,
1. ])}
roc {0: 1.0, 1: 0.0, 'micro': 0.002066115702479337}
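To rule out the few-unique-values explanation for my data, here is a minimal sketch that counts the distinct scores in the first iteration and compares that with the number of points roc_curve returns. The values are copied from the y_test and probas output above; the names y_col, scores, fpr_chk and tpr_chk are only for this check and do not appear in my real code.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Labels copied from the printed y_test output (44 values).
y_col = np.array([
    0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1,
    1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1,
])

# Scores copied from the printed probas output (44 values).
scores = np.array([
    0.97980869, 0.61031697, 0.9463976, 0.07607395, 0.93956894, 0.06914656,
    0.64741115, 0.07618758, 0.95895803, 0.83249766, 0.13942336, 0.7326476,
    0.93728438, 0.07894027, 0.97504296, 0.92879864, 0.93744224, 0.21646299,
    0.95141726, 0.92728865, 0.97493415, 0.07854641, 0.95159664, 0.36212405,
    0.21415855, 0.10376292, 0.95303641, 0.11629533, 0.93807975, 0.7540189,
    0.93019584, 0.94054764, 0.93755026, 0.93893753, 0.95637685, 0.10910955,
    0.96091857, 0.95273078, 0.61031697, 0.9745807, 0.11621697, 0.97879922,
    0.96512002, 0.09424992,
])

# Same call as in my loop, with default arguments apart from pos_label.
fpr_chk, tpr_chk, _ = roc_curve(y_col, scores, pos_label=1)

print("samples:", scores.size)
print("distinct scores:", np.unique(scores).size)
print("ROC points:", len(fpr_chk))
```

If the number of ROC points comes out far below the number of distinct scores, then having few unique values cannot be the whole explanation here.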
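fpr and tpr have only four or five points each, even though there are 44 samples and almost all of the probabilities are distinct! Why is this happening?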
Thanks!