Cross-validation: incompatible shapes

Time: 2016-08-05 05:56:34

Tags: python scikit-learn cross-validation

My CSV data looks like this:

0.03095566878715169,False
0.9700097239723956,False
0.9756176662740987,False
0.9516273399151274,False
0.21111951544035354,False
0.10371038060888567,False
0.018505911665029413,True
0.3595877911788813,True
0.010223522470333259,True
0.0812290660300292,True
0.19798744613629704,True

I am trying to compute a k-fold cross-validation score.

Here is my code:

import os,csv
import numpy as np
from sklearn import cross_validation
from sklearn import datasets
from sklearn import svm

csvout = open('xval.csv','wb')
csvwriter=csv.writer(csvout)

f='some.csv'
try:
    X,Y=[],[]
    feat=f[4:-4]
    print feat
    csvin = open(f,'rb')
    csvread=csv.reader(csvin)
    for row in csvread:
        X.append(row[0])
        Y.append(row[1])

    npX=np.array(X)
    npY=np.array(Y)
    clf = svm.SVC()
    xval_score=cross_validation.cross_val_score(clf,X=npX,y=npY,cv=10)
    csvwriter.writerow([feat,str(xval_score[-1])])
except Exception,e:
    print(e)    
csvout.close()

However, I get the following error:

X and y have incompatible shapes.
X has 1 samples, but y has 837

Or am I going about this the wrong way? I would appreciate it if anyone could shed some light on this.

1 answer:

Answer 0 (score: 0):

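For sklearn estimators, X must be a two-dimensional array of shape (n_samples, n_features). Your X is one-dimensional, and it appears to be interpreted as a single sample with many features, which would explain why the error reports that X has 1 sample. Try the following: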

npX = np.array(X).reshape([-1, 1])
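
For reference, here is a minimal end-to-end sketch of the same fix on a current scikit-learn and Python 3. It is not the asker's exact script: it assumes `sklearn.model_selection` (the module that replaced `sklearn.cross_validation`, which was deprecated in 0.18 and removed in 0.20), inlines the 11 sample rows from the question instead of reading `some.csv`, and uses `cv=3` only because that sample is too small for 10 folds. The helper name `load_xy` is made up for illustration. It also converts the CSV strings to floats and booleans, which the original code does not do.

```python
import csv
import io

import numpy as np
from sklearn import svm
from sklearn.model_selection import cross_val_score

# The sample rows from the question, inlined so the sketch runs without a file.
SAMPLE_CSV = """0.03095566878715169,False
0.9700097239723956,False
0.9756176662740987,False
0.9516273399151274,False
0.21111951544035354,False
0.10371038060888567,False
0.018505911665029413,True
0.3595877911788813,True
0.010223522470333259,True
0.0812290660300292,True
0.19798744613629704,True
"""


def load_xy(lines):
    """Hypothetical helper: read (value, label) rows into X (n_samples, 1) and y (n_samples,)."""
    values, labels = [], []
    for row in csv.reader(lines):
        values.append(float(row[0]))     # csv yields strings; convert to float
        labels.append(row[1] == "True")  # 'True'/'False' text -> bool
    X = np.array(values).reshape(-1, 1)  # one column: a single feature per sample
    y = np.array(labels)
    return X, y


X, y = load_xy(io.StringIO(SAMPLE_CSV))
print(X.shape, y.shape)  # (11, 1) (11,)

clf = svm.SVC()
# cv=3 is an assumption for this tiny sample; with the full 837 rows, cv=10 should be fine.
scores = cross_val_score(clf, X, y, cv=3)
print(scores, scores.mean())
```

With the real file, passing the open file object to `load_xy` in place of the `io.StringIO` wrapper should work the same way, since `csv.reader` accepts any iterable of lines.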