Cross-validation and standardization in scikit-learn

Time: 2017-04-06 10:33:08

Tags: python scikit-learn cross-validation

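I want to measure the accuracy of a sklearn classifier using k-fold cross-validation. I can already estimate the accuracy without cross-validation. How can I change this code so that it performs cross-validation and applies StandardScaler at the same time?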

from sklearn.datasets import load_iris
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
from sklearn import svm
from sklearn.pipeline import Pipeline

iris = load_iris()
X = iris.data
y = iris.target

# Single hold-out split (no cross-validation yet)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)

# Chain scaling and the classifier so the scaler is fit on the training data only
pipe_lrSVC = Pipeline([('scaler', StandardScaler()), ('clf', svm.LinearSVC())])
pipe_lrSVC.fit(X_train, y_train)
y_pred = pipe_lrSVC.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))

1 answer:

Answer 0 (score: 2)

Just pass the pipeline as the estimator argument to cross_val_score:

cross_val_score(pipe_lrSVC, iris.data, iris.target, cv=5)
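For completeness, here is a minimal self-contained sketch of the same idea. Because the scaler is a step inside the pipeline, cross_val_score refits it on the training portion of each fold, so the held-out fold never influences the scaling statistics. The import path sklearn.model_selection, the explicit StratifiedKFold splitter, the random_state value, and the mean/std summary are my own assumptions for illustration and are not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris()
X, y = iris.data, iris.target

# Scaler and classifier are chained, so each CV fold fits the scaler
# on its own training split only.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LinearSVC()),
])

# Assumption: a shuffled, stratified 5-fold splitter with a fixed seed
# so the fold assignment is reproducible. Passing cv=5 also works.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=4)

# One accuracy value per fold.
scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")

print(scores)
print("mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```

The array returned by cross_val_score holds one score per fold; reporting its mean together with the standard deviation gives a rough sense of how stable the estimate is across folds.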