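I am using scikit-learn 0.15.2 for a multi-class problem. Until I started using MultiLabelBinarizer, I was getting a lot of DeprecationWarnings when following examples, such as:
"DeprecationWarning: Direct support for sequence of sequences multilabel representation will be unavailable from version 0.17. Use sklearn.preprocessing.MultiLabelBinarizer to convert to a label indicator representation."
However, I cannot find a way to get the classification report (precision, recall, f-measure) the way I could before, as shown here: scikit 0.14 multi label metrics
I tried using inverse_transform as shown below. This does produce a classification_report, but it also triggers the warning again, which says this code will break from 0.17 onward.
How can I get metrics for a multi-class classification problem?
Sample code: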
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report
# Some simple data:
X_train = np.array([[0,0,0], [0,0,1], [0,1,0], [1,0,0], [1,1,1]])
y_train = [[1], [1], [1,2], [2], [2]]
# Use MultiLabelBinarizer and train a multi-class classifier:
mlb = MultiLabelBinarizer(sparse_output=True)
y_train_mlb = mlb.fit_transform(y_train)
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X_train, y_train_mlb)
# classification_report, here I did not find a way to use y_train_mlb,
# I am getting a lot of DeprecationWarnings
predictions_test = mlb.inverse_transform(clf.predict(X_train))
print(classification_report(y_train, predictions_test))
# Predict new example (predict expects a 2-D array, one row per sample):
print(mlb.inverse_transform(clf.predict(np.array([[0,1,0]]))))
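For reference, the "label indicator representation" mentioned in the warning can be sketched as follows. This is a minimal illustration that assumes Python 3 and a recent scikit-learn release; it uses dense output rather than `sparse_output=True` only so that the matrix is easy to print.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Sequence-of-sequences labels, as in the sample code above.
y = [[1], [1], [1, 2], [2], [2]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(y)  # one column per class, one row per sample

print(mlb.classes_)  # the class that each column stands for
print(Y)             # 0/1 label indicator matrix of shape (5, 2)

# inverse_transform maps the indicator matrix back to tuples of labels.
labels = mlb.inverse_transform(Y)
print(labels)
```

Each row of `Y` marks with a 1 the classes that the sample belongs to, so the third sample, labelled `[1, 2]`, has both columns set.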
Answer 0 (score: 5)
It seems you have to run the classification report on the binarized labels:
print(classification_report(y_train_mlb, clf.predict(X_train)))
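A self-contained sketch of this approach, using the data from the question, might look like the following. It assumes Python 3 and a recent scikit-learn release, and it makes two choices that are not in the original code: dense output from MultiLabelBinarizer, and an optional `target_names` argument so that the report rows carry the original class labels.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Same toy data as in the question.
X_train = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1]])
y_train = [[1], [1], [1, 2], [2], [2]]

# Binarize the labels (dense output here; an assumption for readability).
mlb = MultiLabelBinarizer()
y_train_mlb = mlb.fit_transform(y_train)

clf = OneVsRestClassifier(LinearSVC())
clf.fit(X_train, y_train_mlb)

# Keep predictions in indicator form and compare them directly with the
# binarized ground truth; no inverse_transform is needed for the report.
y_pred = clf.predict(X_train)
report = classification_report(
    y_train_mlb,
    y_pred,
    target_names=[str(c) for c in mlb.classes_],  # optional row labels
)
print(report)
```

`inverse_transform` is then only needed when you want to show the predicted labels themselves, not for computing the metrics.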