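I'm building a predictive model. Everything went fine up to the point of getting cross-validation scores, but now I don't know how to continue. Which function should I use to make predictions once I have the cross-validation scores?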
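from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# The first 16 columns are the features; column 16 is the target.
X = data.iloc[:, 0:16]
Y = data.iloc[:, 16]
validation_size = 0.20
seed = 7
# Hold out 20% of the rows for a final validation check.
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(
    X, Y, test_size=validation_size, random_state=seed)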
models = [
    ('LR', LogisticRegression()),
    ('CART', DecisionTreeClassifier()),
    ('KNN', KNeighborsClassifier()),
    ('SVM', SVC())
]
results, names = [], []
for name, model in models:
    seed = 32
    scoring = 'accuracy'
    # random_state only has an effect when shuffle=True; recent
    # scikit-learn versions raise an error if shuffle is left off.
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)
Answer 0 (score: 0)
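Cross-validation is generally used as a more reliable validation scheme for checking whether your model performs well. cross_val_score only returns the scores; the models it fits on each fold are discarded. So once you are satisfied with the cross-validation score, you can either train the model on the whole dataset and call its predict method, or you can use
sklearn.model_selection.cross_val_predict
to get a cross-validated prediction for every sample. See the documentation for more information.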
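As a rough sketch of both options: the snippet below uses synthetic data from make_classification as a stand-in for your DataFrame (that data, the choice of LogisticRegression, and max_iter=1000 are my assumptions, not something from your post). It first fits the model on the training split and predicts the held-out rows, then uses cross_val_predict to get out-of-fold predictions on the training split.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Assumption: synthetic stand-in for the asker's data (16 features, binary target).
X, Y = make_classification(n_samples=200, n_features=16, random_state=7)
X_train, X_validation, Y_train, Y_validation = train_test_split(
    X, Y, test_size=0.20, random_state=7)

# Assumption: logistic regression picked for illustration; use whichever
# model scored best in your own cross-validation.
model = LogisticRegression(max_iter=1000)

# Option 1: fit on the training data, then predict the held-out rows.
model.fit(X_train, Y_train)
validation_predictions = model.predict(X_validation)
validation_accuracy = accuracy_score(Y_validation, validation_predictions)

# Option 2: out-of-fold predictions. Each training row is predicted by a
# model that was fitted on the other folds only.
kfold = KFold(n_splits=10, shuffle=True, random_state=32)
oof_predictions = cross_val_predict(model, X_train, Y_train, cv=kfold)
oof_accuracy = accuracy_score(Y_train, oof_predictions)

print("validation accuracy: %f" % validation_accuracy)
print("out-of-fold accuracy: %f" % oof_accuracy)
```

Note that cross_val_predict does not give you a fitted model to reuse on new data; for that you still need to call fit yourself, as in option 1.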