I am trying to run this machine learning program, and I am getting the following error:
ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time
My code:
from pylab import *
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import numpy as np
X = list ()
Y = list ()
validationX = list ()
validationY = list ()
file = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineTraining.txt','r')
for eachline in file:
    strArray = eachline.split(";")
    row = list ()
    for i in range(len(strArray) - 1):
        row.append(float(strArray[i]))
    X.append(row)
    if (int(strArray[-1]) > 6):
        Y.append(1)
    else:
        Y.append(0)
file2 = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineValidation.txt', 'r')
for eachline in file2:
    strArray = eachline.split(";")
    row2 = list ()
    for i in range(len(strArray) - 1):
        row2.append(float(strArray[i]))
    validationX.append(row2)
    if (int(strArray[-1]) > 6):
        validationY.append(1)
    else:
        validationY.append(0)
X = np.array(X)
print (X)
Y = np.array(Y)
print (Y)
validationX = np.array(validationX)
validationY = np.array(validationY)
clf = SVC()
clf.fit(X,Y)
result = clf.predict(validationX)
clf.score(result, validationY)
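The goal of the program is to build a model with the fit() command and then compare it against the validation set in validationY to see how well our machine learning model performs. Here is the rest of the console output. Keep in mind that X is a confusing 11x574 array!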
[[ 7. 0.27 0.36 ..., 3. 0.45 8.8 ]
[ 6.3 0.3 0.34 ..., 3.3 0.49 9.5 ]
[ 8.1 0.28 0.4 ..., 3.26 0.44 10.1 ]
...,
[ 6.3 0.28 0.22 ..., 3. 0.33 10.6 ]
[ 7.4 0.16 0.33 ..., 3.04 0.68 10.5 ]
[ 8.4 0.27 0.3 ..., 2.89 0.3
11.46666667]]
[0 0 0 ..., 0 1 0]
C:\Users\User\Anaconda3\lib\site-packages\sklearn\utils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File "<ipython-input-68-31c649fe24b3>", line 1, in <module>
runfile('C:/Users/User/Desktop/csci4113/project1/program1.py', wdir='C:/Users/User/Desktop/csci4113/project1')
File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
execfile(filename, namespace)
File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 89, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/User/Desktop/csci4113/project1/program1.py", line 43, in <module>
clf.score(result, validationY)
File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\base.py", line 310, in score
return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 568, in predict
y = super(BaseSVC, self).predict(X)
File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 305, in predict
X = self._validate_for_predict(X)
File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 474, in _validate_for_predict
(n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time
Answer 0 (score: 0)
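You are simply passing the wrong object to the score function. The documentation states this explicitly:
score(X, y, sample_weight=None)
X : array-like, shape = (n_samples, n_features) Test samples.
whereas you are passing predictions, so
result = clf.predict(validationX)
clf.score(result, validationY)
is invalid. It should be just
clf.score(validationX, validationY)
What you tried to do would be fine if you were using a scorer rather than a classifier. A classifier's .score method calls .predict on its own, so you pass the raw data as the argument.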
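To make the difference concrete, here is a minimal sketch of the two valid ways to get the accuracy. It uses randomly generated stand-in data rather than the wine files from the question, and the names X_train, X_val, y_train and y_val are made up for illustration. The 11 feature columns match the error message. The 574 in the error appears to come from the 1-D predictions array being treated as a single sample with 574 features, which is also what the DeprecationWarning in the console output hints at.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Stand-in data (assumption): 11 feature columns, like the training set
# in the question. The labels are synthetic, not real wine quality scores.
rng = np.random.RandomState(0)
X_train = rng.rand(200, 11)
y_train = (X_train[:, 0] > 0.5).astype(int)
X_val = rng.rand(50, 11)
y_val = (X_val[:, 0] > 0.5).astype(int)

clf = SVC()
clf.fit(X_train, y_train)

# Option 1: pass the raw validation features; score() calls predict() itself.
acc_from_score = clf.score(X_val, y_val)

# Option 2: predict first, then pass the predictions to a metric function.
predictions = clf.predict(X_val)
acc_from_metric = accuracy_score(y_val, predictions)

print(predictions.shape)   # 1-D: one label per validation sample
print(acc_from_score, acc_from_metric)
```

Both options compute the same number. The second form is the one to use if you want to keep the predictions around, for example to inspect which samples were misclassified.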