I'm using xgboost and sklearn for a Kaggle competition (https://www.kaggle.com/c/house-prices-advanced-regression-techniques#evaluation). Specifically, I'm using GridSearchCV to try different hyperparameters for my XGBRegressor model. Here's what I'm doing:
import pandas as pd
import numpy as np
import sklearn as sk
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV

pd.options.display.max_rows = 100

xgb_model = xgb.XGBRegressor()
params = {"max_depth": [3, 4], "learning_rate": [0.05],
          "n_estimators": [1000, 2000], "n_jobs": [8], "subsample": [0.8], "random_state": [42]}
grid_search_cv = GridSearchCV(xgb_model, params, scoring="neg_mean_absolute_error",
                              n_jobs=8, cv=KFold(n_splits=10, shuffle=True, random_state=42), verbose=2)
grid_search_cv.fit(X, y)  # X, y: training features/target prepared from the competition data
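For reference, after the search finishes I read out the winning configuration like this. This is a minimal self-contained sketch: it uses synthetic data and sklearn's GradientBoostingRegressor as a stand-in for XGBRegressor (and a smaller grid/CV), since the real X and y come from the Kaggle data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for xgb.XGBRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic regression data in place of the Kaggle training set
X, y = make_regression(n_samples=100, n_features=5, random_state=42)

params = {"max_depth": [2, 3], "learning_rate": [0.05]}
gs = GridSearchCV(GradientBoostingRegressor(random_state=42), params,
                  scoring="neg_mean_absolute_error",
                  cv=KFold(n_splits=3, shuffle=True, random_state=42))
gs.fit(X, y)

print(gs.best_params_)  # the best combination found in the grid
print(gs.best_score_)   # mean cross-validated score of that combination
```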
So I'm trying a few values for max_depth, learning_rate, n_estimators, etc. Strangely, calling .fit() above produces this output:
GridSearchCV(cv=KFold(n_splits=10, random_state=42, shuffle=True),
error_score='raise',
estimator=XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
n_jobs=1, nthread=None, objective='reg:linear', random_state=0,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
silent=True, subsample=1),
fit_params=None, iid=True, n_jobs=8,
param_grid={'max_depth': [3, 4], 'learning_rate': [0.05], 'n_estimators': [1000, 2000], 'n_jobs': [8], 'subsample': [0.8], 'random_state': [42]},
pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
scoring='neg_mean_absolute_error', verbose=2)
So GridSearchCV prints itself, but with an XGBRegressor instance whose hyperparameter values I never specified? These look like the defaults, but I can't find anything explaining this in sklearn's .fit() documentation (only .fit_params)...
Any guidance would be great!
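In case it helps narrow things down, here's a minimal reproduction of the same behavior with a plain sklearn estimator (Ridge on synthetic data, grid and names chosen just for illustration). It suggests the printed object is simply the return value of .fit():

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=50, n_features=3, random_state=0)
gs = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0]}, cv=3)

# fit() returns the GridSearchCV object itself, so an interactive session
# echoes its repr -- which displays the *estimator template* with its
# constructor defaults, not the tuned hyperparameter values.
ret = gs.fit(X, y)
print(ret is gs)  # True
```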