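Python ValueError even though the shapes appear to match

Time: 2019-10-08 02:39:05

Tags: python scikit-learn

I am getting a ValueError, even though the shapes of my input variables look like they match. Here is the error: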

ValueError: Found input variables with inconsistent numbers of samples: [644170, 14]

Here is my code:

# 10-fold cross-validation
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

kfold = KFold(n_splits=10, random_state=1)
results = cross_val_score(estimator = grid.best_estimator_, X = X, y = y, cv = kfold, scoring = 'f1_macro') # https://scikit-learn.org/0.17/modules/generated/sklearn.cross_validation.cross_val_score.html
results # Array of scores of the estimator for each run of the cross validation.

Here are the shapes:

X.shape
(644170, 14)

y.shape
(14,)

Both shapes contain a 14.

1 Answer:

Answer 0 (score: 2)

The error appears to be here:

X.shape
# (644170, 14)

y.shape
# (14,)

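You have 644170 observations in your training set (each with 14 features), but only 14 target values. You need 644170 target values, one per observation, for cross-validation to run.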
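The check that fails compares the first axis of each array, not just any matching dimension. Here is a minimal sketch of that comparison, using small made-up arrays as stand-ins for the ones in the question (the names `y_bad` and `y_ok` are hypothetical):

```python
import numpy as np

# Hypothetical stand-ins for the question's arrays, scaled down:
# 6 observations with 14 features, but only 14 target values.
X = np.zeros((6, 14))
y_bad = np.zeros(14)

# The number of samples is the length of the first axis.
n_samples_X = X.shape[0]      # 6
n_samples_y = y_bad.shape[0]  # 14

# The 14 in X.shape is the feature count, so the sample counts differ.
shapes_consistent = n_samples_X == n_samples_y

# A valid target has exactly one entry per row of X.
y_ok = np.zeros(X.shape[0])
```

The 14 in `X.shape` counts features, while the 14 in `y.shape` counts samples, so the two numbers agreeing is a coincidence.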

To make this concrete, look at this classic example from the sklearn documentation, which uses the diabetes dataset:

from sklearn import datasets, linear_model
from sklearn.model_selection import cross_val_score
diabetes = datasets.load_diabetes()
X = diabetes.data[:150]
y = diabetes.target[:150]
lasso = linear_model.Lasso()
cross_val_score(lasso, X, y, cv=3)

The shapes of X and y are:

X.shape
# (150, 10)

y.shape
# (150,)

That is, one target value for each observation in the training set.
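One way a target can end up with shape (14,) is by selecting a row of the data where a column was intended. This is only a guess, since the question does not show how y was built. The sketch below uses a made-up array `data` with a hypothetical layout (14 feature columns followed by one target column) to show the difference:

```python
import numpy as np

# Hypothetical table: 5 observations, 14 feature columns plus 1 target column.
data = np.arange(5 * 15).reshape(5, 15)

# All rows, first 14 columns: the feature matrix.
X = data[:, :14]

# A single row: one value per feature, which gives shape (14,).
y_wrong = data[0, :14]

# The last column: one value per observation, which gives shape (5,).
y_right = data[:, 14]
```

Only `y_right` has the same first dimension as `X`, so only it could be passed to `cross_val_score` along with `X`.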