Training an SVM (support vector machine) classifier with Scikit-learn

Time: 2017-08-15 08:06:51

Tags: python machine-learning scikit-learn classification

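I want to use Scikit-learn to train different classifiers on a multi-label classification problem with the following code:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

names = [
    "Nearest Neighbors", "Linear SVM", "RBF SVM", "Gaussian Process",
    "Decision Tree", "Random Forest", "Neural Net", "AdaBoost",
    "Naive Bayes", "QDA"]

classifiers = [
    KNeighborsClassifier(3),
    SVC(C=0.025),  # note: SVC defaults to the RBF kernel unless kernel="linear" is passed
    SVC(gamma=2, C=1),
    GaussianProcessClassifier(1.0 * RBF(1.0)),
    DecisionTreeClassifier(max_depth=5),
    RandomForestClassifier(max_depth=5),
    MLPClassifier(alpha=0.5),
    AdaBoostClassifier(),
    GaussianNB(),
    QuadraticDiscriminantAnalysis()]

# X_train, Y_train, X_test, Y_test are assumed to be loaded already.
for name, clf in zip(names, classifiers):
    clf.fit(X_train, Y_train)
    # Evaluate on the held-out split.
    score = clf.score(X_test, Y_test)
    print("%s %s" % (name, score))

The KNeighbors classifier works fine, but when I reach the SVM classifier, it throws the following exception:

Traceback (most recent call last):
  File "/Users/mac/PycharmProjects/GraphLstm/classifier.py", line 87, in <module>
    clf.fit(X_train, Y_train)
  File "/Library/Python/2.7/site-packages/sklearn/svm/base.py", line 151, in fit
    X, y = check_X_y(X, y, dtype=np.float64, order='C', accept_sparse='csr')
  File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 526, in check_X_y
    y = column_or_1d(y, warn=True)
  File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 562, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape (9280, 39)

What is the reason, and how can I solve this problem?

Edit: after @Vivek's comment, I am keeping only the classifiers that allow multi-label classification.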

2 answers:

Answer 0 (score: 3)

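The fit method of the KNN classifier accepts a matrix as y. SVC does not: it requires a 1-D y of shape (n_samples,). The error message is pointing at the rejected shape of y, here (9280, 39), i.e. 9280 samples with 39 label columns.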
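To make the difference concrete, here is a minimal sketch. The data is a small random stand-in (an assumption, not the asker's dataset), and the exact wording of the error message varies between scikit-learn versions, so only the exception type is shown.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in data (hypothetical): 60 samples, 5 features,
# and 3 binary labels per sample stored as a 2-D indicator matrix.
rng = np.random.RandomState(0)
X = rng.rand(60, 5)
Y = rng.randint(0, 2, size=(60, 3))

# KNeighborsClassifier accepts a 2-D y and predicts one column per label.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, Y)
knn_pred_shape = knn.predict(X).shape

# SVC expects a 1-D y of shape (n_samples,), so the same call raises ValueError.
svc_error = None
try:
    SVC(gamma=2, C=1).fit(X, Y)
except ValueError as exc:
    svc_error = exc

print(knn_pred_shape)
print(type(svc_error).__name__)
```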

Answer 1 (score: 1)

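Since this is a multi-label classification problem, not every estimator in scikit-learn can handle it natively. The documentation lists the estimators that support multi-label targets out of the box, such as the various tree-based estimators and a few others: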

sklearn.tree.DecisionTreeClassifier
sklearn.tree.ExtraTreeClassifier
sklearn.ensemble.ExtraTreesClassifier
sklearn.neighbors.KNeighborsClassifier
...
...

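However, there are strategies, such as one-vs-all, that can be used to train an estimator that does not directly support multi-label targets. The scikit-learn estimator OneVsRestClassifier is made for exactly this.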

See the documentation for details.
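As a rough sketch of the one-vs-rest approach, the snippet below wraps an SVC in OneVsRestClassifier so that one binary SVM is fitted per label column. The random data and its shapes are assumptions for illustration only; in the question, X_train and Y_train would be the asker's own arrays, with Y_train as a 0/1 indicator matrix.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic stand-in data (hypothetical): 3 binary labels per sample.
rng = np.random.RandomState(0)
X_train = rng.rand(80, 5)
Y_train = rng.randint(0, 2, size=(80, 3))
X_test = rng.rand(20, 5)
Y_test = rng.randint(0, 2, size=(20, 3))

# One independent binary SVC is trained for each label column.
clf = OneVsRestClassifier(SVC(gamma=2, C=1))
clf.fit(X_train, Y_train)

# Predictions come back as an indicator matrix, one column per label.
Y_pred = clf.predict(X_test)

# For multi-label targets, score() is subset accuracy:
# a sample counts as correct only if every label matches.
score = clf.score(X_test, Y_test)
print(Y_pred.shape, score)
```

Because subset accuracy is strict, scores with many labels (39 in the question) can look low even when most individual labels are predicted correctly; a per-label metric may be more informative in that case.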