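I am trying to train a scikit-learn classifier on two features named x1 and x2. Both arrays have shape (490,1). To pass a single X argument to clf.fit(X,y), I used np.concatenate to build an array of shape (490,2). The label array consists of 1s and 0s and has shape (490,). The code is shown below: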
x1 = int_x # previously defined array shape (490,1)
x2 = int_x2 # previously defined array shape (490,1)
y=np.ravel(close) # where close is composed of 1's and 0's shape (490,1)
X,y = np.concatenate((x1[:-1],x2[:-1]),axis=1), y[:-1] #train on all datapoints except last
clf = SVC()
clf.fit(X,y)
The following error is shown:
X.shape[1] = 1 should be equal to 2, the number of features at training time
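I do not understand why this message appears, because when I check the shape of X its second dimension is indeed 2, not 1. I originally tried this with only one feature and clf.fit(X,y) worked fine, so I am inclined to think that np.concatenate is producing something unsuitable. Any suggestions would be great.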
Answer 0: (score: 0)
It is hard to say without the concrete values of int_x, int_x2 and close. Indeed, if I build int_x, int_x2 and close randomly as
import numpy as np
from sklearn.svm import SVC
int_x = np.random.normal(size=(490,1))
int_x2 = np.random.normal(size=(490,1))
close = np.random.randint(2, size=(490,))
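which matches your specification, then your code runs. So the error is probably related to how int_x, int_x2 and close are constructed.
If you believe the problem is not there, could you share a minimal reproducible example with int_x, int_x2 and close?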
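To make that check concrete, here is a minimal sketch that combines the random arrays above with the training code from the question. The seeded generator `rng` and the shape of `close` (taken here as (490,1), following the comment in the question) are assumptions made for illustration, not the asker's real data.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: random stand-ins for the asker's arrays, seeded for repeatability.
rng = np.random.default_rng(0)
int_x = rng.normal(size=(490, 1))
int_x2 = rng.normal(size=(490, 1))
close = rng.integers(0, 2, size=(490, 1))

# Training code from the question, unchanged in substance.
x1, x2 = int_x, int_x2
y = np.ravel(close)
X, y = np.concatenate((x1[:-1], x2[:-1]), axis=1), y[:-1]

print(X.shape, y.shape)  # two feature columns, one label per row

clf = SVC()
clf.fit(X, y)

# The fitted model stores support vectors with one column per training feature.
print(clf.support_vectors_.shape[1])
```

If this runs cleanly on your machine as well, the concatenation step is probably not the source of the error, and the mismatch more likely comes from a later call that passes an array with a different number of columns.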
Answer 1: (score: 0)
I think I understand what was wrong with my code.
First, I should have created another variable, say x, holding the concatenation of int_x and int_x2. It has shape (490,2), so it has the same number of rows as close. This came in handy later.
Next, the clf.fit(X,y) call was not incorrect in itself. However, I did not formulate my prediction code correctly. I had written clf.predict([close[-1]]) in the hope of getting the binary target output (either 0 or 1), but the argument passed to this method was wrong: close[-1] is a single label, so the classifier saw a sample with one feature instead of two. It should have been clf.predict([x[-1]]), because the model predicts a label from a row of features, not the other way around. Since x now has the same number of rows as close, clf.predict([x[-1]]) returns the prediction for close[-1].
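The difference between the two predict calls can be sketched as follows. The random arrays and the seeded generator `rng` are assumptions standing in for the real data, and the exact wording of the error depends on the installed scikit-learn version.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: random stand-ins for the real int_x, int_x2 and close.
rng = np.random.default_rng(0)
int_x = rng.normal(size=(490, 1))
int_x2 = rng.normal(size=(490, 1))
close = rng.integers(0, 2, size=(490, 1))

x = np.concatenate((int_x, int_x2), axis=1)  # shape (490, 2)
y = np.ravel(close)                          # shape (490,)

clf = SVC()
clf.fit(x[:-1], y[:-1])  # train on every row except the last

# Wrong: a label is passed where a feature row is expected (1 column, not 2).
try:
    clf.predict([close[-1]])
    wrong_call_failed = False
except ValueError as err:
    wrong_call_failed = True
    print("predict on a label failed:", err)

# Right: pass the held-out feature row; the result is the predicted label.
pred = clf.predict([x[-1]])
print(pred, "actual:", y[-1])
```

Wrapping the row in a list gives predict a 2-D input with a single sample; x[-1:] or x[-1].reshape(1, -1) would do the same.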