How to use scikit

Asked: 2017-01-20 17:05:03

Tags: python-3.x scikit-learn cross-validation train-test-split

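Hello, I would like to combine a train/test split with cross-validation and get the results as AUC.

My first approach works, but it only reports accuracy.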

from sklearn.model_selection import train_test_split

# split data into train+validation set and test set
X_trainval, X_test, y_trainval, y_test = train_test_split(dataset.data, dataset.target)
# split train+validation set into training and validation sets
X_train, X_valid, y_train, y_valid = train_test_split(X_trainval, y_trainval)
# train the classifier on the training set
clf.fit(X_train, y_train)
# evaluate the classifier on the validation set (score() returns accuracy)
score = clf.score(X_valid, y_valid)
# refit on the combined training + validation set and evaluate on the test set
clf.fit(X_trainval, y_trainval)
test_score = clf.score(X_test, y_test)

I could not find out how to apply roc_auc here. Please help.

2 answers:

Answer 0 (score: 0)

With scikit-learn, you can compute the ROC curve like this:

import numpy as np
from sklearn import metrics
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)

Now we get:

print(fpr)

array([0. , 0.5, 0.5, 1. ])

print(tpr)

array([0.5, 0.5, 1. , 1. ])

print(thresholds)

array([0.8 , 0.4 , 0.35, 0.1 ])
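The curve by itself is not yet the AUC the question asks for. As a minimal sketch on the same toy data, the area can be obtained either by integrating the curve with `metrics.auc` or directly with `metrics.roc_auc_score`. The variable names `auc_from_curve`, `y_binary` and `auc_direct` are made up for this illustration. Note that recent scikit-learn releases prepend an extra starting point to the arrays returned by `roc_curve`, so the printed arrays may be one element longer than shown above; the area is not affected.

```python
import numpy as np
from sklearn import metrics

# Same toy data as above: label 2 is treated as the positive class.
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# Option 1: build the ROC curve, then integrate it with the trapezoidal rule.
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
auc_from_curve = metrics.auc(fpr, tpr)

# Option 2: compute the area directly from 0/1 labels and the scores.
y_binary = (y == 2).astype(int)
auc_direct = metrics.roc_auc_score(y_binary, scores)

print(auc_from_curve, auc_direct)  # both are 0.75 for this data
```

Three of the four (positive, negative) score pairs are ranked correctly here, which is where the value 0.75 comes from.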

Answer 1 (score: 0)

In your code, once the classifier has been trained, you can get the predictions with:

y_preds = clf.predict(X_test)

Then use them to compute the AUC value:

from sklearn.metrics import roc_curve, auc

fpr, tpr, thresholds = roc_curve(y_test, y_preds, pos_label=1)
auc_roc = auc(fpr, tpr)
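To tie this back to the question, here is a rough sketch of one way to combine a held-out test set with cross-validation and report AUC throughout. Several things in it are assumptions, not part of the original post: the breast cancer dataset stands in for `dataset`, a scaled `SVC` stands in for `clf`, and `cv=5`, `random_state=0` and the stratified split are arbitrary choices. It also passes continuous scores from `decision_function` to `roc_auc_score` instead of the hard labels from `predict`, since hard labels give a ROC curve with only a single operating point.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: a binary classification dataset stands in for `dataset`.
dataset = load_breast_cancer()

# Hold out a test set that is never touched during cross-validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    dataset.data, dataset.target, stratify=dataset.target, random_state=0
)

# Assumption: a scaled SVM stands in for `clf`.
clf = make_pipeline(StandardScaler(), SVC())

# Cross-validated AUC on the train+validation portion
# (replaces the manual train/validation split).
cv_auc = cross_val_score(clf, X_trainval, y_trainval, cv=5, scoring="roc_auc")
print("CV AUC per fold:", cv_auc)
print("Mean CV AUC:", cv_auc.mean())

# Refit on all train+validation data, then score the held-out test set
# using continuous decision values, not hard class labels.
clf.fit(X_trainval, y_trainval)
test_scores = clf.decision_function(X_test)
test_auc = roc_auc_score(y_test, test_scores)
print("Test AUC:", test_auc)
```

For classifiers without `decision_function`, the positive-class column of `predict_proba` can play the same role.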