I have two files: predictions.csv and target.csv.
Format of predictions.csv:
SampleID,Target
t1,-1.0454370703147253e-05
t2,-0.48161680725663214
t3,8.1420547483708091e-06
.
.
.
t4950,-6.4382307796971309e-05
Format of target.csv:
#SampleID,Target [0 or 1],Details [-1 or 4]
1,0,4
2,0,4
3,0,4
.
.
.
4950,0,4
What I tried:
import numpy
from sklearn import metrics
target_file = "target.csv"
prediction_file = "predictions.csv"
true = numpy.genfromtxt(target_file,delimiter=',')
scores = numpy.genfromtxt(prediction_file, delimiter=',')
scores = scores[1:,1:]
true = true[:,2:]
fpr, tpr, thresholds = metrics.roc_curve(true, scores)
Traceback:
Traceback (most recent call last):
File "<ipython-input-26-d4232bf9bd64>", line 1, in <module>
runfile('C:/Users/MyAccount/Documents/Spyder/Connectomics/myauc.py', wdir='C:/Users/MyAccount/Documents/Spyder/Connectomics')
File "C:\Users\MyAccount\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Users\MyAccount\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/Users/MyAccount/Documents/Spyder/Connectomics/myauc.py", line 19, in <module>
fpr, tpr, thresholds = metrics.roc_curve(true, scores)
File "C:\Users\MyAccount\Anaconda\lib\site-packages\sklearn\metrics\ranking.py", line 477, in roc_curve
y_true, y_score, pos_label=pos_label, sample_weight=sample_weight)
File "C:\Users\MyAccount\Anaconda\lib\site-packages\sklearn\metrics\ranking.py", line 297, in _binary_clf_curve
raise ValueError("Data is not binary and pos_label is not specified")
How can I find the AUC?
Edit: Added the possible values that the target.csv columns can take.
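For reference, a minimal sketch of one way the computation might go. It assumes the 0/1 column (index 1) of target.csv is the intended ground truth rather than the Details column (index 2, with values -1 or 4), and that the rows of both files are in the same order. The helper name `auc_from_csv` and the four-row sample data are made up for illustration and are not the real files.

```python
import io

import numpy
from sklearn import metrics


def auc_from_csv(target_file, prediction_file):
    """Hypothetical helper: AUC from files laid out like the two CSVs above."""
    # The target.csv header starts with '#', so genfromtxt skips it as a comment.
    true = numpy.genfromtxt(target_file, delimiter=',')
    # The predictions.csv header and the 't1', 't2', ... IDs are parsed as nan.
    scores = numpy.genfromtxt(prediction_file, delimiter=',')

    y_true = true[:, 1]       # 0/1 labels (assumed ground truth), not column 2
    y_score = scores[1:, 1]   # drop the header row, keep the score column

    fpr, tpr, thresholds = metrics.roc_curve(y_true, y_score)
    return metrics.auc(fpr, tpr)


# Made-up sample data in the same layout as the real files.
sample_target = io.StringIO(
    "#SampleID,Target [0 or 1],Details [-1 or 4]\n"
    "1,0,4\n"
    "2,1,-1\n"
    "3,0,4\n"
    "4,1,-1\n"
)
sample_predictions = io.StringIO(
    "SampleID,Target\n"
    "t1,0.1\n"
    "t2,0.9\n"
    "t3,0.2\n"
    "t4,0.8\n"
)

result = auc_from_csv(sample_target, sample_predictions)
print(result)
```

With the real data, the same function could be called as `auc_from_csv("target.csv", "predictions.csv")`. `metrics.roc_auc_score(y_true, y_score)` should give the same number without building the curve first.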