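I have a multiclass dataset and a classifier (NearestCentroid) that I need to wrap in a OneVsRestClassifier so the problem becomes a set of binary classifications (for plotting ROC curves).
So far I have been able to get this working with SVM, KNN, and bagging. However, when I try to use NearestCentroid with OneVsRestClassifier, I get an error, apparently because the classifier has neither a predict_proba() nor a decision_function() method.
Here is the main part of the code I am using (the same code works for the other classifiers):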
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.metrics import roc_curve, auc
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Bag-of-words -> TF-IDF -> nearest-centroid classifier
text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', NearestCentroid()),
])

categories = ['Business', 'Sci/Tech', 'World', 'Sports']
# One indicator column per category
y_train = label_binarize(y_train, classes=categories)
y_test = label_binarize(y_test, classes=categories)

# classifier
new_clf = OneVsRestClassifier(text_clf)
another_new_clf = new_clf.fit(X_train, y_train)
y_score = another_new_clf.predict(X_test)  # raises the AttributeError shown below
This is the error I get:
AttributeError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\sklearn\multiclass.py in _predict_binary(estimator, X)
93 try:
---> 94 score = np.ravel(estimator.decision_function(X))
95 except (AttributeError, NotImplementedError):
~\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in __get__(self, obj, type)
109 else:
--> 110 getattr(delegate, self.attribute_name)
111 break
AttributeError: 'NearestCentroid' object has no attribute 'decision_function'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
<ipython-input-25-34dca44072e5> in <module>
2 new_clf = OneVsRestClassifier(text_clf)
3 another_new_clf = new_clf.fit(X_train, y_train)
----> 4 y_score = another_new_clf.predict(X_test)
5
6 # new_clf = OneVsRestClassifier(text_clf)
~\Anaconda3\lib\site-packages\sklearn\multiclass.py in predict(self, X)
331 indptr = array.array('i', [0])
332 for e in self.estimators_:
--> 333 indices.extend(np.where(_predict_binary(e, X) > thresh)[0])
334 indptr.append(len(indices))
335 data = np.ones(len(indices), dtype=int)
~\Anaconda3\lib\site-packages\sklearn\multiclass.py in _predict_binary(estimator, X)
95 except (AttributeError, NotImplementedError):
96 # probabilities of the positive class
---> 97 score = estimator.predict_proba(X)[:, 1]
98 return score
99
~\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in __get__(self, obj, type)
108 continue
109 else:
--> 110 getattr(delegate, self.attribute_name)
111 break
112 else:
AttributeError: 'NearestCentroid' object has no attribute 'predict_proba'
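Reading the traceback, the failure seems to come from the fallback order in _predict_binary: it first asks the wrapped estimator for decision_function and, if that is missing, for predict_proba. The sketch below imitates that order in plain Python, without scikit-learn, to show why a predict-only classifier fails and what kind of continuous score would satisfy the check. HardLabelCentroid, ScoredCentroid, and predict_binary are hypothetical names used only for illustration, and the "distance to the negative centroid minus distance to the positive centroid" score is an assumption, not something scikit-learn defines in the version shown above.

```python
import math


class HardLabelCentroid:
    """Hypothetical stand-in for a classifier that only offers predict()."""

    def fit(self, X, y):
        # Average the rows of each class to get one centroid per label.
        self.centroids_ = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids_[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        # Hard label: the class whose centroid is closest to each sample.
        return [
            min(self.centroids_, key=lambda c: math.dist(x, self.centroids_[c]))
            for x in X
        ]


class ScoredCentroid(HardLabelCentroid):
    """Same model, plus a continuous score for the positive class (label 1)."""

    def decision_function(self, X):
        # Assumed score: positive when x is closer to centroid 1 than to centroid 0.
        return [
            math.dist(x, self.centroids_[0]) - math.dist(x, self.centroids_[1])
            for x in X
        ]


def predict_binary(estimator, X):
    """Imitate the fallback order visible in the traceback."""
    try:
        return list(estimator.decision_function(X))
    except (AttributeError, NotImplementedError):
        # Probability of the positive class, if the estimator offers it.
        return [row[1] for row in estimator.predict_proba(X)]


# Toy binary problem: two points per class.
X = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
y = [0, 0, 1, 1]
scores = predict_binary(ScoredCentroid().fit(X, y), [[0.0, 0.5], [4.0, 4.5]])
```

With ScoredCentroid the call returns one real-valued score per sample, which is the kind of input a ROC curve needs. Passing HardLabelCentroid instead raises AttributeError on predict_proba, which mirrors the second exception in the traceback.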
Any help would be greatly appreciated. Thanks in advance.