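I am using XGBRegressor inside a pipeline. The pipeline contains the preprocessing steps and the model (XGBRegressor).
Below are the complete preprocessing steps. (I have already defined numeric_cols and cat_cols.)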
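from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
numerical_transfer = SimpleImputer()
# Categorical columns: fill with the most frequent value, then one-hot encode
cat_transfer = Pipeline(steps = [
    ('imputer', SimpleImputer(strategy = 'most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown = 'ignore'))
])
preprocessor = ColumnTransformer(
    transformers = [
        ('num', numerical_transfer, numeric_cols),
        ('cat', cat_transfer, cat_cols)
    ])
# model is the XGBRegressor instance
my_model = Pipeline(steps = [('preprocessor', preprocessor), ('model', model)])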
When I fit without early_stopping_rounds, the code works fine:
my_model.fit(X_train, y_train)
But when I use early_stopping_rounds as shown below, I get an error.
my_model.fit(X_train, y_train, model__early_stopping_rounds=5, model__eval_metric = "mae", model__eval_set=[(X_valid, y_valid)])
The error is:
ValueError: DataFrame.dtypes for data must be int, float or bool.
Did not expect the data types in fields MSZoning, Street, Alley, LotShape, LandContour, Utilities, LotConfig, LandSlope, Condition1, Condition2, BldgType, HouseStyle, RoofStyle, RoofMatl, MasVnrType, ExterQual, ExterCond, Foundation, BsmtQual, BsmtCond, BsmtExposure, BsmtFinType1, BsmtFinType2, Heating, HeatingQC, CentralAir, Electrical, KitchenQual, Functional, FireplaceQu, GarageType, GarageFinish, GarageQual, GarageCond, PavedDrive, PoolQC, Fence, MiscFeature, SaleType, SaleCondition
Answer 0 (score: 1)
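The error arises because Pipeline.fit sends only X_train through the preprocessor. Any fit parameter prefixed with model__ is handed to the final step's fit as-is, so the X_valid inside eval_set still contains the raw string columns. The sketch below illustrates this with scikit-learn alone. RecordingRegressor is a hypothetical stand-in for XGBRegressor that merely records what its fit receives, and the toy data are made up.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


class RecordingRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in for XGBRegressor: records what fit() receives."""

    def fit(self, X, y, eval_set=None):
        self.n_features_seen_ = X.shape[1]  # column count after preprocessing
        self.eval_set_ = eval_set           # whatever the pipeline forwarded
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(X.shape[0], self.mean_)


# Made-up toy data: one numeric column and one string column.
X_train = pd.DataFrame({"LotArea": [8450.0, 9600.0, 11250.0, 9550.0],
                        "Street": ["Pave", "Grvl", "Pave", "Pave"]})
y_train = np.array([208.5, 181.5, 223.5, 140.0])
X_valid = pd.DataFrame({"LotArea": [14260.0, 14115.0],
                        "Street": ["Grvl", "Pave"]})
y_valid = np.array([250.0, 143.0])

preprocessor = ColumnTransformer(transformers=[
    ("num", SimpleImputer(), ["LotArea"]),
    ("cat", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), ["Street"]),
])
my_model = Pipeline(steps=[("preprocessor", preprocessor),
                           ("model", RecordingRegressor())])

my_model.fit(X_train, y_train, model__eval_set=[(X_valid, y_valid)])

model = my_model.named_steps["model"]
forwarded_X = model.eval_set_[0][0]
# The training data reached the model already encoded (more columns than the raw frame)...
print(model.n_features_seen_, "columns seen by the model vs", X_train.shape[1], "raw columns")
# ...while the validation frame was forwarded without any preprocessing.
print("eval_set X is the original DataFrame:", forwarded_X is X_valid)
```

With a real XGBRegressor in the last step, that untouched DataFrame is what triggers the ValueError about unexpected data types.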
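# Make a copy to avoid changing original data
X_valid_eval = X_valid.copy()
# Remove the model from pipeline
eval_set_pipe = Pipeline(steps = [('preprocessor', preprocessor)])
# Fit the preprocessor on the training data, then transform the copy of X_valid
X_valid_eval = eval_set_pipe.fit(X_train, y_train).transform(X_valid_eval)
# Pass the transformed validation set as eval_set
my_model.fit(X_train, y_train, model__early_stopping_rounds=5, model__eval_metric = "mae", model__eval_set=[(X_valid_eval, y_valid)])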
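Before handing the transformed validation set to eval_set, it can be worth checking that it is purely numeric and has the same number of columns as the transformed training data. The following is a minimal sketch of such a check, assuming a preprocessor shaped like the one in the question. The column names and values are made-up toy data, and "Wood" is deliberately placed only in the validation set to show what handle_unknown = 'ignore' does with an unseen category.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up toy data; the category "Wood" never appears in the training set.
X_train = pd.DataFrame({"LotArea": [8450.0, np.nan, 11250.0, 9550.0],
                        "Fence": ["MnPrv", "GdPrv", "MnPrv", "MnPrv"]})
X_valid = pd.DataFrame({"LotArea": [14260.0, 14115.0],
                        "Fence": ["Wood", "GdPrv"]})

preprocessor = ColumnTransformer(transformers=[
    ("num", SimpleImputer(), ["LotArea"]),
    ("cat", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), ["Fence"]),
])

# Preprocessing-only pipeline: fitted on the training data, applied to both sets.
eval_set_pipe = Pipeline(steps=[("preprocessor", preprocessor)])
X_train_t = eval_set_pipe.fit(X_train).transform(X_train)
X_valid_eval = eval_set_pipe.transform(X_valid.copy())


def to_dense(matrix):
    """Return a dense array whether the transformer produced sparse or dense output."""
    return matrix.toarray() if hasattr(matrix, "toarray") else np.asarray(matrix)


valid_dense = to_dense(X_valid_eval)

# Same column layout as the training matrix, and no string columns left.
print("train columns:", X_train_t.shape[1], "valid columns:", X_valid_eval.shape[1])
print("numeric dtype:", np.issubdtype(valid_dense.dtype, np.number))
# The unseen category is encoded as all zeros across the one-hot columns.
print("one-hot part of the 'Wood' row:", valid_dense[0, 1:])
```

Fitting the preprocessor on X_train only, and merely transforming X_valid, keeps the validation data from influencing the imputation values or the learned categories.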