Why does the output of cross_validate differ from a hand-written loop when using XGBClassifier?

Asked: 2019-10-08 12:52:15

Tags: python numpy scikit-learn xgbclassifier

Code #1: pass a pipeline with PCA and XGBClassifier steps to scikit-learn's cross_validate function

from xgboost import XGBClassifier
from sklearn.model_selection import cross_validate, LeaveOneOut
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA

import random
random.seed(42)
import numpy as np
np.random.seed(42)

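# X and y are the asker's feature matrix and labels; the question does not show them.
kwargs = {
    'n_jobs': -1,  # parallelism of cross_validate itself, not of XGBClassifier
    'cv': LeaveOneOut(),
    'X': X,
    'y': y
}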

pipe = Pipeline([
    ('pca', PCA(1, random_state=42)),
    ('xgbc', XGBClassifier(random_state=42))
])

results = cross_validate(pipe, **kwargs)
print(results['test_score'].mean())

Code #2: hand-write the cross-validation loop and compute the mean accuracy on exactly the same input X as Code #1

from xgboost import XGBClassifier
from sklearn.model_selection import LeaveOneOut
from sklearn.decomposition import PCA

import random
random.seed(42)
import numpy as np
np.random.seed(42)

acc = []
for train_idx, test_idx in LeaveOneOut().split(X, y):

    x_train, x_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]

    pca = PCA(1, random_state=42)
    pca.fit(x_train)
    x_train = pca.transform(x_train)
    x_test = pca.transform(x_test)

    model = XGBClassifier(random_state=42, n_jobs=-1)
    model.fit(x_train, y_train)

    score = model.score(x_test, y_test)
    acc.append(score)

print(np.mean(acc))
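
One way to narrow down where the two snippets diverge is to compare them fold by fold rather than only by their means. The following is a minimal sketch, not a diagnosis of the original problem. The helper `per_fold_scores`, the synthetic `X_demo`/`y_demo` data, and the `DecisionTreeClassifier` stand-in are all assumptions introduced for illustration so that the example runs without xgboost; substituting `XGBClassifier(random_state=42)` and the real `X`, `y` would reproduce the setup in the question. Both paths use the same pipeline object, so constructor arguments cannot differ between them (note that Code #2 passes `n_jobs=-1` to XGBClassifier while Code #1 does not).

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import LeaveOneOut, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


def per_fold_scores(pipe, X, y):
    """Hypothetical helper: return (cross_validate scores, manual-loop scores)."""
    cv = LeaveOneOut()
    # n_jobs=1 keeps the run single-process so parallelism is ruled out as a factor.
    cv_scores = cross_validate(pipe, X, y, cv=cv, n_jobs=1)["test_score"]

    manual_scores = []
    for train_idx, test_idx in cv.split(X, y):
        fold_pipe = clone(pipe)  # fresh, unfitted copy for every fold
        fold_pipe.fit(X[train_idx], y[train_idx])
        manual_scores.append(fold_pipe.score(X[test_idx], y[test_idx]))
    return cv_scores, np.asarray(manual_scores)


# Placeholder data (assumption); replace with the real X and y.
X_demo, y_demo = make_classification(n_samples=30, n_features=5, random_state=0)

# Stand-in classifier (assumption); swap in XGBClassifier(random_state=42) to test the real case.
demo_pipe = Pipeline([
    ("pca", PCA(1, random_state=42)),
    ("clf", DecisionTreeClassifier(random_state=42)),
])

cv_scores, manual_scores = per_fold_scores(demo_pipe, X_demo, y_demo)
differing_folds = np.flatnonzero(cv_scores != manual_scores)
print("cross_validate mean:", cv_scores.mean())
print("manual loop mean:   ", manual_scores.mean())
print("folds that disagree:", differing_folds)
```

If `differing_folds` is empty for the stand-in but not for XGBClassifier, the discrepancy likely lies in how the classifier is configured or run rather than in the cross-validation logic itself.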

0 answers:

No answers yet.