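I keep getting an error saying that regr.xgboost does not support factor inputs. I am trying to use hyperparameter tuning to determine the parameter values for the model I train.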
train <- createDummyFeatures(train)
test <- createDummyFeatures(test)
train_task <- makeRegrTask(data = train, target = "purchase")
xgb_learner <- makeLearner("regr.xgboost", nrounds = 10, nthread = 1,
par.vals = list(objective = "reg:linear", eval_metric ="rmse"))
xgb_param_set <- makeParamSet(makeIntegerParam("nrounds", lower = 100, upper = 500),
makeIntegerParam("max_depth", lower = 1, upper = 10),
makeNumericParam("eta", lower = 0.1, upper = 0.3),
makeNumericParam("lambda", lower = -1, upper = 0, trafo = function(x) 10^x))
tune_control <- makeTuneControlRandom(maxit = 10)
resample_desc <- makeResampleDesc("CV", iters = 4)
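As a side note on the parameter set above: the `trafo` on `lambda` means the tuner samples an exponent in [-1, 0] and the learner receives 10 raised to that exponent. A minimal base R sketch of that mapping follows; the name `lambda_trafo` and the sample exponents are only illustrative and are not part of the original code.

```r
# Same transformation as in the parameter set: sample x, pass 10^x to the learner.
lambda_trafo <- function(x) 10^x

# Illustrative exponents spanning the declared range [-1, 0].
exponents <- c(-1, -0.5, 0)

# Values the learner would actually see for lambda.
lambda_values <- lambda_trafo(exponents)
print(lambda_values)
```

So the effective search range for `lambda` is 0.1 to 1, sampled uniformly on the log10 scale.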
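Everything works fine until I run the tuneParams function, and I don't know what I'm missing. I used createDummyFeatures to convert the factor variables into dummy variables.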
tuning_params <- tuneParams(learner = xgb_learner, task = train_task,
resampling = resample_desc, par.set = xgb_param_set, control = tune_control)
xgb_Hyper_model <- train(setHyperPars(learner = xgb_learner, par.vals = tuning_params$x), train_task)
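One way to confirm that the dummy encoding actually took effect is to list any columns that are still factors just before building the task. The sketch below uses only base R on a made-up data frame; `toy` and `factor_cols` are hypothetical names, and the column names merely mirror the ones in this post.

```r
# Toy stand-in for the real data; values are made up for illustration.
toy <- data.frame(
  purchase = c(8370, 15200, 1422),
  gender = factor(c("F", "F", "M")),
  occupation = c(10, 10, 16)
)

# Return the names of all columns that are still factors.
factor_cols <- function(df) {
  names(df)[vapply(df, is.factor, logical(1))]
}

print(factor_cols(toy))
```

Running `factor_cols(train)` on the real data right before `makeRegrTask` should return an empty character vector if every factor was converted.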
Here is the output of the code:
> xgb_learner <- makeLearner("regr.xgboost", nrounds = 10, nthread = 1,
+ par.vals = list(objective = "reg:linear", eval_metric ="rmse"))
> xgb_param_set <- makeParamSet(makeIntegerParam("nrounds", lower = 100, upper = 500),
+ makeIntegerParam("max_depth", lower = 1, upper = 10),
+ makeNumericParam("eta", lower = 0.1, upper = 0.3),
+ makeNumericParam("lambda", lower = -1, upper = 0, trafo = function(x) 10^x))
> tune_control <- makeTuneControlRandom(maxit = 10)
> resample_desc <- makeResampleDesc("CV", iters = 4)
> tuning_params <- tuneParams(learner = xgb_learner, task = train_task,
+ resampling = resample_desc, par.set = xgb_param_set, control = tune_control)
[Tune] Started tuning learner regr.xgboost for parameter set:
             Type len Def     Constr Req Tunable Trafo
nrounds   integer   -   - 100 to 500   -    TRUE     -
max_depth integer   -   -    1 to 10   -    TRUE     -
eta       numeric   -   - 0.1 to 0.3   -    TRUE     -
lambda    numeric   -   -    -1 to 0   -    TRUE     Y
With control class: TuneControlRandom
Imputation value: Inf
[Tune-x] 1: nrounds=400; max_depth=6; eta=0.125; lambda=0.221
Error in checkLearnerBeforeTrain(task, learner, weights) :
Task 'train' has factor inputs in 'gender, age, city_category, stay_in_current_cit...', but learner 'regr.xgboost' does not support that!
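The error names the task 'train', which matches mlr's default of using the data object's name as the task id, and it lists columns (gender, age, ...) that appear already dummy-encoded in the data shown below. One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that `train_task` was created from `train` before `createDummyFeatures` was run and was never rebuilt. The sketch below shows the difference on a made-up data frame; `toy`, `stale_task` and `fresh_task` are hypothetical names.

```r
library(mlr)

# Toy stand-in for the real data; values are made up for illustration.
toy <- data.frame(
  purchase = c(8370, 15200, 1422, 1057, 7969, 15227),
  gender = factor(c("F", "F", "F", "F", "M", "M")),
  occupation = c(10, 10, 10, 10, 16, 15)
)

# A task built before dummy encoding still carries the factor column.
stale_task <- makeRegrTask(data = toy, target = "purchase")
stale_counts <- getTaskDesc(stale_task)$n.feat
print(stale_counts)

# Encode first, then build the task from the encoded data.
toy_encoded <- createDummyFeatures(toy, target = "purchase")
fresh_task <- makeRegrTask(data = toy_encoded, target = "purchase")
fresh_counts <- getTaskDesc(fresh_task)$n.feat
print(fresh_counts)
```

If the factor count of the task passed to `tuneParams` is not zero, re-running `makeRegrTask` after the encoding step would be the first thing to try.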
The first 6 rows of the data:
  occupation marital_status product_category_1 product_category_2 product_category_3 purchase prop_aboveavg_occ prop_aboveavg_age
1         10              0                  3                 NA                 NA     8370         0.5000000         0.5000000
2         10              0                  1                  6                 14    15200         0.5333333         0.5263158
3         10              0                 12                 NA                 NA     1422         0.7500000         0.8000000
4         10              0                 12                 14                 NA     1057         0.3636364         0.5000000
5         16              0                  8                 NA                 NA     7969         0.9230769         1.0000000
6         15              0                  1                  2                 NA    15227         0.5454545         0.5543071
  prop_aboveavg_gender prop_aboveavg_city_category prop_aboveavg_marital_status prop_aboveavg_stay_in_current_city_years gender.F gender.M
1            0.5833333                   0.5468750                    0.5419847                                0.5625000        1        0
2            0.4820144                   0.4213836                    0.4244186                                0.4343434        1        0
3            0.6851852                   0.6250000                    0.6250000                                0.7647059        1        0
4            0.5000000                   0.3875000                    0.4269006                                0.3835616        1        0
5            0.7302632                   0.7681159                    0.7211538                                0.7857143        0        1
6            0.5955734                   0.5714286                    0.6250000                                0.5803571        0        1
  age.0.17 age.18.25 age.26.35 age.36.45 age.46.50 age.51.55 age.55. city_category.A city_category.B city_category.C
1        1         0         0         0         0         0       0               1               0               0
2        1         0         0         0         0         0       0               1               0               0
3        1         0         0         0         0         0       0               1               0               0
4        1         0         0         0         0         0       0               1               0               0
5        0         0         0         0         0         0       1               0               0               1
6        0         0         1         0         0         0       0               1               0               0
  stay_in_current_city_years.0 stay_in_current_city_years.1 stay_in_current_city_years.2 stay_in_current_city_years.3
1                            0                            0                            1                            0
2                            0                            0                            1                            0
3                            0                            0                            1                            0
4                            0                            0                            1                            0
5                            0                            0                            0                            0
6                            0                            0                            0                            1
  stay_in_current_city_years.4.
1                             0
2                             0
3                             0
4                             0
5                             1
6                             0