scikit-learn pipeline using KNeighborsClassifier

Asked: 2017-03-11 08:51:34

Tags: python scikit-learn pipeline grid-search

I am trying to build a GridSearchCV pipeline in sklearn that uses both KNeighborsClassifier and SVM. So far I have tried the following code:

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
neigh = KNeighborsClassifier(n_neighbors=3)
from sklearn import svm
from sklearn.svm import SVC
clf = SVC(kernel='linear')
pipeline = Pipeline([ ('knn',neigh), ('sVM', clf)]) # Code breaks here
weight_options = ['uniform','distance']
param_knn = {'weights':weight_options}
param_svc = {'kernel':('linear', 'rbf'), 'C':[1,5,10]}
grid = GridSearchCV(pipeline, param_knn, param_svc, cv=5, scoring='accuracy')

But I get the following error:

TypeError: All intermediate steps should be transformers and implement fit and transform. 'KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=3, p=2,
           weights='uniform')' (type <class 'sklearn.neighbors.classification.KNeighborsClassifier'>) doesn't

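Can anyone help me understand what I am doing wrong and how to correct it? I think there is also something wrong in the last line, regarding the parameters.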

2 answers:

Answer 0: (score: 1)

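The error states that KNeighborsClassifier has no transform method. KNN is a classifier: it implements fit and predict, but not transform, and the same is true of SVC. A Pipeline accepts any number of steps, but every step except the last must be a transformer that implements fit and transform; only the final step may be a plain estimator such as a classifier. Placing KNeighborsClassifier before SVC therefore fails. Please see the link below.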

http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
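One way to compare both classifiers in a single search is to keep only one classifier step at the end of the pipeline and let the parameter grid swap it. The following is a minimal sketch, not the asker's final code: the iris dataset, the StandardScaler step, and the step names 'scaler' and 'clf' are assumptions made for illustration. Note that GridSearchCV takes a single param_grid argument (a dict or a list of dicts), and that keys for pipeline parameters use the '<step name>__<parameter>' form.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: iris stands in for the asker's data, which the question does not show.
X, y = load_iris(return_X_y=True)

# The intermediate step is a transformer; the final step 'clf' is a
# placeholder classifier that the grid below replaces.
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', KNeighborsClassifier(n_neighbors=3)),
])

# One dict per candidate model. Setting 'clf' swaps the final step, and
# 'clf__<parameter>' tunes whichever estimator currently sits in that step.
param_grid = [
    {
        'clf': [KNeighborsClassifier(n_neighbors=3)],
        'clf__weights': ['uniform', 'distance'],
    },
    {
        'clf': [SVC()],
        'clf__kernel': ['linear', 'rbf'],
        'clf__C': [1, 5, 10],
    },
]

grid = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy')
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

The search evaluates 2 KNN settings and 6 SVC settings, and grid.best_estimator_ is the whole pipeline refit with the winning classifier. If no preprocessing is wanted, the scaler step can be dropped, leaving a one-step pipeline.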

Answer 1: (score: 0)

Intermediate steps in a scikit-learn Pipeline need a transform() method. You may want to try imblearn's Pipeline instead.

For an example, see: https://bsolomon1124.github.io/oversamp/
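To check in advance whether an estimator can serve as an intermediate step, one can test for a transform method. This is a small sketch using scikit-learn only; StandardScaler is included purely as an illustrative transformer and does not appear in the question.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

candidates = [StandardScaler(), KNeighborsClassifier(), SVC()]

# An estimator is usable as an intermediate Pipeline step only if it
# exposes transform(); classifiers such as KNN and SVC do not.
can_be_intermediate = {
    type(est).__name__: hasattr(est, 'transform') for est in candidates
}
print(can_be_intermediate)
```

With a recent scikit-learn release, only StandardScaler reports True here, which is why both classifiers from the question belong in the final position of a pipeline.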