Can't get the SVC score function to work

Time: 2016-09-27 22:22:08

Tags: python numpy machine-learning

I am trying to run this machine learning program, and I get the following error:

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time

My code:

from pylab import *
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import numpy as np

X = list ()
Y = list ()
validationX = list ()
validationY = list ()
file = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineTraining.txt','r')
for eachline in file:
    strArray = eachline.split(";")
    row = list ()
    for i in range(len(strArray) - 1):
        row.append(float(strArray[i])) 
    X.append(row)
    if (int(strArray[-1]) > 6):
        Y.append(1)
    else:
        Y.append(0)
file2 = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineValidation.txt', 'r')
for eachline in file2:
    strArray = eachline.split(";")
    row2 = list ()
    for i in range(len(strArray) - 1):
        row2.append(float(strArray[i])) 
    validationX.append(row2)      
    if (int(strArray[-1]) > 6):
        validationY.append(1)
    else:
        validationY.append(0)

X = np.array(X)
print (X)
Y = np.array(Y)
print (Y)
validationX = np.array(validationX)
validationY = np.array(validationY)

clf = SVC()
clf.fit(X,Y)
result = clf.predict(validationX)
clf.score(result, validationY)

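The goal of this program is to build a model with fit() and then compare its predictions against the validation labels in validationY to see how well the model works. Here is the rest of the console output. Keep in mind that X is a confusing 11x574 array!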

[[  7.           0.27         0.36       ...,   3.           0.45         8.8       ]
 [  6.3          0.3          0.34       ...,   3.3          0.49         9.5       ]
 [  8.1          0.28         0.4        ...,   3.26         0.44        10.1       ]
 ..., 
 [  6.3          0.28         0.22       ...,   3.           0.33        10.6       ]
 [  7.4          0.16         0.33       ...,   3.04         0.68        10.5       ]
 [  8.4          0.27         0.3        ...,   2.89         0.3
   11.46666667]]
[0 0 0 ..., 0 1 0]
C:\Users\User\Anaconda3\lib\site-packages\sklearn\utils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
Traceback (most recent call last):

  File "<ipython-input-68-31c649fe24b3>", line 1, in <module>
    runfile('C:/Users/User/Desktop/csci4113/project1/program1.py', wdir='C:/Users/User/Desktop/csci4113/project1')

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
    execfile(filename, namespace)

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 89, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/User/Desktop/csci4113/project1/program1.py", line 43, in <module>
    clf.score(result, validationY)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\base.py", line 310, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 568, in predict
    y = super(BaseSVC, self).predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 305, in predict
    X = self._validate_for_predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 474, in _validate_for_predict
    (n_features, self.shape_fit_[1]))

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time



1 answer:

Answer 0: (score: 0)

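You are simply passing the wrong object to the score function. The documentation states this clearly:

score(X, y, sample_weight=None)

X : array-like, shape = (n_samples, n_features)   Test samples.

You pass the predictions instead, so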

result = clf.predict(validationX)
clf.score(result, validationY)

is invalid. It should just be

clf.score(validationX, validationY)
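
As for where the 574 in the error message comes from, here is a small sketch. It assumes the validation set has 574 rows (which the error message suggests) and that the 1-D prediction array gets treated as a single sample, as the DeprecationWarning in the question hints. The array names mirror the question; the zero-filled data is only a placeholder.

```python
import numpy as np

# Assumed sizes, taken from the error message in the question
n_samples, n_features = 574, 11

validationX = np.zeros((n_samples, n_features))  # placeholder for the real feature matrix
result = np.zeros(n_samples, dtype=int)          # placeholder for clf.predict(validationX)

# predict() returns one label per sample, so the result is 1-D
print(result.shape)

# Treating that 1-D array as a single sample turns it into one row
# with 574 columns, i.e. 574 "features" instead of the expected 11
as_single_sample = result.reshape(1, -1)
print(as_single_sample.shape)

# The matrix that score() expects has 11 columns
print(validationX.shape)
```

So the classifier is not complaining about the training data; it is complaining that the predictions were handed to it as if they were features.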

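What you tried to do would be fine if you were using a standalone scorer rather than the classifier. A classifier's .score method calls .predict on its own, so you pass the raw data as the argument.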
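
To make the difference concrete, here is a minimal sketch of both routes. The synthetic data and the choice of accuracy_score as the standalone scorer are assumptions for illustration; the question reads the wine files instead.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in data with 11 features (an assumption, not the wine data)
rng = np.random.RandomState(0)
X = rng.rand(200, 11)
Y = (X[:, 0] > 0.5).astype(int)
validationX = rng.rand(50, 11)
validationY = (validationX[:, 0] > 0.5).astype(int)

clf = SVC()
clf.fit(X, Y)

# Route 1: the classifier's own score() takes the raw features
# and calls predict() internally
acc_from_score = clf.score(validationX, validationY)

# Route 2: predict first, then pass the predictions to a standalone metric
result = clf.predict(validationX)
acc_from_metric = accuracy_score(validationY, result)

print(acc_from_score, acc_from_metric)
```

Both routes should report the same accuracy. The only thing that differs is who calls predict.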