用于精确度,AUC和召回率的Kfold命令运行良好,但现在显示错误。
内核多次重新启动,并尝试了其他无济于事的方法,例如“分层折叠”,“枚举”和循环。
from sklearn.model_selection import KFold
svc_clf = svm.SVC(C=50,
kernel='rbf',
gamma=0.1,
probability=False,
class_weight={1: 5}
)
svc_clf.fit(X_train_std, y_train)
# K-fold cross-validator
kfold = Kfold(n_splits=10, random_state=140311, shuffle=True)
for train_index, test_index in kfold.split(X):
X_training, X_testing = X_train_std[train_index], X_train_std[test_index]
y_training, y_testing = y_train[train_index], y_train[test_index]
df_kfold_acc = cross_val_score(svc_clf, X_train_std, y_train, cv=kfold, scoring='accuracy')
print'10 fold validation accuracy scores: \n', (df_kfold_acc)
print'Kfold mean accuracy score: \n', (df_kfold_acc).mean()
df_kfold_auc = cross_val_score(svc_clf, X_train_std, y_train, cv=kfold, scoring='roc_auc')
print'\n\n 10 fold validation AUC scores:\n ', (df_kfold_auc)
print'Kfold mean AUC score: \n', (df_kfold_auc).mean()
df_kfold_recall = cross_val_score(svc_clf, X_train_std, y_train, cv=kfold, scoring='recall')
print'\n\n 10 fold validation recall scores:\n', (df_kfold_recall)
print'Kfold mean recall score: \n', (df_kfold_recall).mean()
预期(和以前的类似)如下:
10折验证准确性得分:{0.7982993、0.6793838等(共10折)}} Kfold平均准确率分数:0.78679979
实际错误:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-56-61c6420c7a2f> in <module>()
6 class_weight={1: 5}
7 )
----> 8 svc_clf.fit(X_train_std, y_train)
9
10 # K-fold cross-validator
/Users/db/anaconda2/lib/python2.7/site-packages/sklearn/svm/base.pyc in fit(self, X, y, sample_weight)
147 self._sparse = sparse and not callable(self.kernel)
148
--> 149 X, y = check_X_y(X, y, dtype=np.float64, order='C', accept_sparse='csr')
150 y = self._validate_targets(y)
151
/Users/db/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in check_X_y(X, y, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
576 dtype=None)
577 else:
--> 578 y = column_or_1d(y, warn=True)
579 _assert_all_finite(y)
580 if y_numeric and y.dtype.kind == 'O':
/Users/db/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in column_or_1d(y, warn)
612 return np.ravel(y)
613
--> 614 raise ValueError("bad input shape {0}".format(shape))
615
616
ValueError: bad input shape (513, 10)
答案 0 :(得分:0)
该错误警告您正在传递用于训练的数据的形状不正确。 x_train:训练向量{类似数组,稀疏矩阵},形状(n_samples,n_features) y:相对于类似X数组形状(n_samples)的目标向量。
由于没有提供有关正在使用的数据的信息,因此X的(513,10)形状还可以,但是您应该检查目标矢量形状。应该是上面提到的形状。
from sklearn.model_selection import KFold,cross_val_score
from sklearn import svm
X_train = np.array([[1,1,1,1],[1,1,1,1],[0,0,0,0],[0,0,0,0]])
y_train = np.array([1,1,0,0])
svc_clf = svm.SVC(C=50,
kernel='rbf',
gamma=0.1,
probability=False,
class_weight={1: 5}
)
# K-fold cross-validator
kfold = KFold(n_splits=4, random_state=140311, shuffle=True)
df_kfold_acc = cross_val_score(svc_clf, X_train, y_train, cv=kfold, scoring='accuracy')
print('4 fold validation accuracy scores: \n', (df_kfold_acc))
print('Kfold mean accuracy score: \n', (df_kfold_acc).mean())
输出:
4 fold validation accuracy scores:
[1. 1. 1. 1.]
Kfold mean accuracy score:
1.0