How to use cross-validation after a train/test split

Time: 2021-04-15 09:21:56

Tags: indexing cross-validation train-test-split

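After splitting the data into training and test sets, I ran k-fold cross-validation on the training set, but this raises an error. I think it is caused by the indices left over from the train/test split. The code I used is below. Any suggestions on how to reset the index after the train/test split, or any other way to handle this error, would be appreciated. I have already tried df.reset_index(), but that raises AttributeError: 'numpy.ndarray' object has no attribute 'reset_index'. Thanks.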

import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=99)

# k-fold cross validation
scores = list()
kfold = KFold(n_splits=10, shuffle=True)
# enumerate splits
for train_ix, test_ix in kfold.split(X_train):
    train_X, test_X = X_train[train_ix], X_train[test_ix]
    train_y, test_y = y_train[train_ix], y_train[test_ix]
    # fit model
    model = LinearRegression()
    model.fit(train_X, train_y)
    # evaluate model
    yhat = model.predict(test_X)
    score = np.sqrt(metrics.mean_absolute_error(yhat, test_y))
    scores.append(score)
    print('Fold score : {}'.format(score))

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Int64Index([    3,     9,    10,    17,    19,\n            ...\n            41050, 41056, 41060, 41101, 41120],\n           dtype='int64', length=3708).
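
The KeyError above is what pandas raises when integer positions are used as index labels. One plausible reading is that y_train is a pandas Series that kept its original, shuffled index after train_test_split, while X_train is a NumPy array (which would also explain the reset_index failure). The following is a minimal sketch under that assumption, not a confirmed diagnosis. The data is synthetic, the helper name cross_validate_rmse is hypothetical, and the score is a true RMSE (square root of the mean squared error) rather than the square root of the mean absolute error used in the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, train_test_split

# Synthetic stand-in data (assumption: the original X and y are not shown).
# X is a NumPy array and y is a pandas Series, mirroring the suspected setup.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = pd.Series(X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200))

# After this call y_train is still a Series, with a shuffled subset of the
# original labels as its index.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=99
)


def cross_validate_rmse(X_train, y_train, n_splits=10, seed=99):
    """Return the per-fold RMSE of LinearRegression on the training split."""
    # np.asarray drops any pandas index, so the integer positions produced
    # by KFold.split can be used directly.
    X_arr = np.asarray(X_train)
    y_arr = np.asarray(y_train)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_ix, test_ix in kfold.split(X_arr):
        model = LinearRegression()
        model.fit(X_arr[train_ix], y_arr[train_ix])
        yhat = model.predict(X_arr[test_ix])
        scores.append(np.sqrt(mean_squared_error(y_arr[test_ix], yhat)))
    return scores


scores = cross_validate_rmse(X_train, y_train)
for fold, score in enumerate(scores, start=1):
    print("Fold {} RMSE: {:.4f}".format(fold, score))
```

If y_train should stay a Series, y_train.iloc[train_ix] selects by position in the same way. y_train.reset_index(drop=True) is another option, but it only exists on pandas objects, not on NumPy arrays.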

0 answers:

No answers