Question

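I am trying to train on two features named x1 and x2 using scikit-learn. Both arrays have shape (490,1). To pass a single X argument to clf.fit(X,y), I used np.concatenate to produce an array of shape (490,2). The label array is composed of 1's and 0's and has shape (490,). The code is shown below: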

import numpy as np
from sklearn.svm import SVC

x1 = int_x  # previously defined array, shape (490,1)
x2 = int_x2  # previously defined array, shape (490,1)
y = np.ravel(close)  # close is composed of 1's and 0's, shape (490,1)
# train on all data points except the last
X, y = np.concatenate((x1[:-1], x2[:-1]), axis=1), y[:-1]
clf = SVC()
clf.fit(X, y)

The following error is shown:

X.shape[1] = 1 should be equal to 2, the number of features at training time

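I do not understand why this message appears, because when I check the shape of X its second dimension is indeed 2 and not 1. I initially tried this with only one feature and clf.fit(X,y) worked fine, so I am inclined to think that np.concatenate produced something unsuitable. Any suggestions would be great.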

Answer 1

It is hard to say without the specific values of int_x, int_x2 and close. Indeed, if I try with int_x, int_x2 and close randomly constructed as

import numpy as np
from sklearn.svm import SVC

int_x = np.random.normal(size=(490,1))
int_x2 = np.random.normal(size=(490,1))
close = np.random.randint(2, size=(490,))

to match your specification, then your code runs. So the error is probably related to how int_x, int_x2 and close are constructed.

If you think the problem is not there, could you please share a minimal reproducible example with close, int_x and int_x2?
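The check described in this answer can be sketched end to end as follows. This is only a minimal sketch under stated assumptions: the data are random stand-ins with the shapes given in the question, and the seeded generator `rng` is not part of the original post. It confirms that the concatenation yields two feature columns and that the fit itself succeeds.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: a fixed seed, used only to make the sketch repeatable.
rng = np.random.default_rng(0)

# Random stand-ins with the shapes stated in the question.
int_x = rng.normal(size=(490, 1))
int_x2 = rng.normal(size=(490, 1))
close = rng.integers(0, 2, size=(490, 1))  # labels made of 0's and 1's

# Same construction as in the question: all data points except the last.
y = np.ravel(close)[:-1]
X = np.concatenate((int_x[:-1], int_x2[:-1]), axis=1)

print(X.shape, y.shape)  # (489, 2) (489,)

clf = SVC()
clf.fit(X, y)

# The support vectors have one column per feature seen during training.
print(clf.support_vectors_.shape[1])
```

If the shapes printed for the real int_x, int_x2 and close differ from these, the construction of those arrays is the first place to look.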

Answer 2

I think I understand what was wrong with my code.

First, I should have created another variable, say x, defined as the concatenation of int_x and int_x2. It has shape (490,2), so it has the same number of rows as close. This came in handy later.

Next, the clf.fit(X,y) call was not incorrect in itself. However, I did not formulate my prediction code correctly. I had written clf.predict([close[-1]]) in the hope of capturing the binary target output (either 0 or 1), but the argument passed to this method was wrong: close[-1] holds a single label value, so it was treated as one sample with one feature, which is what triggered the error. It should have been clf.predict([x[-1]]), because the algorithm predicts the label from the features and not the other way around. Since x has one row for each entry of close, clf.predict([x[-1]]) should produce the predicted value of close[-1].
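The difference between the two predict calls can be illustrated with a short sketch. It assumes random stand-in data with the shapes from the question (the seeded generator `rng` and the flag `raised` are introduced here for illustration only). The exact wording of the error message depends on the scikit-learn version, so the sketch only relies on a ValueError being raised.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: random stand-ins for the real data, seeded for repeatability.
rng = np.random.default_rng(0)
x = rng.normal(size=(490, 2))              # concatenated features
close = rng.integers(0, 2, size=(490, 1))  # labels made of 0's and 1's

clf = SVC()
clf.fit(x[:-1], np.ravel(close)[:-1])  # train on all rows except the last

# Wrong: [close[-1]] is one sample with ONE feature (a label value),
# while the model was trained on TWO features.
raised = False
try:
    clf.predict([close[-1]])
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Right: pass the feature row of the held-out data point.
pred = clf.predict([x[-1]])
print(pred.shape)  # (1,)
```

The returned array holds one predicted label, which can then be compared with close[-1].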

Scikit SVM error: X.shape[1] = 1 should be equal to 2
