Question

I'm trying to use LinearSVC on my data. My code is as follows:

from sklearn import svm

clf2 = svm.LinearSVC()
clf2.fit(X_train, y_train)

This results in the following error:

ValueError: bad input shape (2190, 9)

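I one-hot encoded the y values before splitting them into y_test and y_train, and I believe that is the problem. I tried a fix from a similar question (sklearn (Bad Input Shape) ValueError), but I still get an error when I try to reshape.

After one-hot encoding, my target variable (y) has 9 classes, one per column, and I have 2190 samples in total. It seems I need to reduce those 9 columns to a single column to fit the SVM.

Any advice would be greatly appreciated!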

Answer 1

LinearSVC does not accept a two-dimensional y. As documented:

Parameters:

y : array-like, shape = [n_samples]

    Target vector relative to X

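So you don't need to convert the labels to a one-hot encoded matrix. Just pass them in as they are, even if they are strings. They will be handled correctly internally.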
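As a minimal sketch of that point, the snippet below fits LinearSVC directly on a 1-D array of string labels. The data, the label names, and the variable names are made up for illustration and are not taken from the question.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Made-up data for illustration: 90 samples, 4 features
rng = np.random.RandomState(0)
X = rng.rand(90, 4)

# 1-D array of string labels, shape (90,); no one-hot encoding applied
y = np.array(["cat", "dog", "bird"] * 30)

clf = LinearSVC()
clf.fit(X, y)

print(y.shape)       # 1-D target, as the docs require
print(clf.classes_)  # the unique labels the classifier learned
pred = clf.predict(X[:5])
print(pred)          # predictions come back as the original strings
```

If the labels only exist in one-hot form, they have to be collapsed back to a single column before calling fit.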

Answer 2

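You need to reshape the array so that y is one-dimensional. Here is an example using random data, with a target variable that has 5 classes: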

import numpy as np
from sklearn import svm

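# 100 samples and 10 features
x = np.random.rand(100, 10)

# 5 classes, repeated to give 100 labels
y = [1, 2, 3, 4, 5] * 20

# Make sure both are NumPy arrays; y must be 1-D with shape (n_samples,)
x = np.asarray(x)
y = np.asarray(y)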

print(x.shape)
print(y.shape)

clf2 = svm.LinearSVC()
clf2.fit(x, y)

Output:

(100, 10)

(100,)

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
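If y has already been one-hot encoded, as in the question, one way to get back to the 1-D shape used here is to take the index of the 1 in each row with np.argmax. This is a sketch on made-up data; y_onehot, y_1d, and labels are hypothetical names, and it assumes each row contains exactly one 1.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Made-up data for illustration: 100 samples, 10 features, 5 classes
rng = np.random.RandomState(0)
X = rng.rand(100, 10)
labels = np.array([0, 1, 2, 3, 4] * 20)

# Build a one-hot matrix of shape (100, 5), like the (2190, 9) one in the question
y_onehot = np.eye(5, dtype=int)[labels]

# Collapse each row to the column index of its 1, giving shape (100,)
y_1d = np.argmax(y_onehot, axis=1)

print(y_onehot.shape)
print(y_1d.shape)

clf = LinearSVC()
clf.fit(X, y_1d)  # no "bad input shape" error, since y is now 1-D
```

The resulting integers are column indices, so mapping them back to the original class names depends on the column order the encoder used.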

Answer 3

Going by the documentation, if you want to keep the encoded targets you can try wrapping the estimator in sklearn.multiclass.OneVsRestClassifier.
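The original snippet for this answer is not available, so the following is only a sketch of what such usage might look like, on made-up data with hypothetical variable names. OneVsRestClassifier accepts a 2-D 0/1 indicator matrix as y and fits one binary LinearSVC per column.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Made-up data for illustration: 100 samples, 10 features, 5 classes
rng = np.random.RandomState(0)
X = rng.rand(100, 10)
labels = np.array([0, 1, 2, 3, 4] * 20)
y_onehot = np.eye(5, dtype=int)[labels]  # shape (100, 5)

# One binary LinearSVC is trained per column of y_onehot
ovr = OneVsRestClassifier(LinearSVC())
ovr.fit(X, y_onehot)

pred = ovr.predict(X)
print(pred.shape)  # predictions are also a 0/1 matrix, one column per class
```

Note that a 2-D indicator target is treated as a multilabel problem, so a predicted row may contain no 1 or more than one 1. For ordinary single-label classification, passing a 1-D y to LinearSVC is usually the simpler route.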
