ValueError when using learning_curve on an sklearn.svm.SVC(kernel='rbf') classifier

Asked: 2016-09-25 02:43:04

Tags: python machine-learning scikit-learn

I get a strange ValueError when using learning_curve on an svm.SVC(kernel='rbf') classifier.

I am using:

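from sklearn import svm
from sklearn import cross_validation, datasets, preprocessing
# learning_curve lives in sklearn.learning_curve in releases that still ship cross_validation
from sklearn.learning_curve import learning_curve

clf = svm.SVC(kernel='rbf')
cv = cross_validation.StratifiedKFold(y, n_folds=10)
# Print which classes appear in the train and test split of each fold
for enum, (train, test) in enumerate(cv):
    print("Fold {0}, classes in train {1}, \t classes in test {2}".format(enum, set(y[train]), set(y[test])))
train_sizes, train_scores, test_scores = learning_curve(
    clf, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)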

I can see that both the training set and the test set contain both classes in every fold:

Fold 0, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 1, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 2, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 3, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 4, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 5, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 6, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 7, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 8, classes in train set([0, 1]),    classes in test set([0, 1])
Fold 9, classes in train set([0, 1]),    classes in test set([0, 1])

Then I get the following error:

ValueError: The number of classes has to be greater than one; got 1

Can anyone help me find a workaround? Thanks!

1 answer:

Answer 0 (score: 1)

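This is most likely caused by learning_curve itself, which retrains the model on subsamples of the data of different sizes. By default the sample sizes are train_sizes=array([ 0.1, 0.33, 0.55, 0.78, 1. ]). A small subsample can end up containing only one class even though every full training fold contains both, and SVC then refuses to fit.

Depending on your data, you may be able to work around the problem by omitting the smaller fractions, for example by setting train_sizes=array([0.55, 0.78, 1. ]). You should also consider reducing the number of folds in your cross-validation.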
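To check which fractions are affected before calling learning_curve, something like the following sketch may help. It uses only plain Python and rests on an assumption about how the subsamples are drawn: that the first n indices of each training fold are taken without shuffling, with n = int(fraction * fold size). The helper name classes_per_train_size and the toy labels are hypothetical and are not part of scikit-learn or of the question's data.

```python
def classes_per_train_size(y, train_indices, fractions):
    """Return {fraction: set of labels} for the first-n prefix of a training fold.

    Assumption: the learning-curve routine uses the first n training indices
    without shuffling, where n = int(fraction * fold size), and at least 1.
    """
    result = {}
    for frac in fractions:
        n = max(1, int(frac * len(train_indices)))
        result[frac] = {y[i] for i in train_indices[:n]}
    return result


# Hypothetical labels: 50 samples of class 0 followed by 50 of class 1.
y = [0] * 50 + [1] * 50
# Hypothetical training fold: every sample except each 10th one, so the fold
# contains both classes but keeps them in sorted order.
train_indices = [i for i in range(len(y)) if i % 10 != 0]

fractions = [0.1, 0.33, 0.55, 0.78, 1.0]
report = classes_per_train_size(y, train_indices, fractions)
for frac in fractions:
    print(frac, sorted(report[frac]))

# Keep only the fractions whose prefix contains more than one class.
safe_fractions = [f for f in fractions if len(report[f]) > 1]
print("usable train_sizes:", safe_fractions)
```

With real data, train_indices would come from each (train, test) pair of the cross-validation iterator, and only fractions that are safe in every fold should be passed as train_sizes. Newer scikit-learn releases also accept a shuffle argument on learning_curve, which may be an alternative to dropping the small fractions when the labels are sorted by class.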