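I am trying to examine my model in R using xgboost. Training generally works, but I am running into a problem with the metric.
I tried converting the class column to a factor, but that did not help.
My data: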
ID var1 var2 TARGET
1 5 0 1
2 4 3 1
3 4 2 0
4 3 1 0
5 2 4 1
6 1 2 1
7 5 3 1
8 4 1 0
9 4 1 0
10 2 4 1
11 5 5 1
To prepare it, I do:
train <- read.csv()
train.y <- train$TARGET
train$TARGET <- NULL
train$ID <- NULL
train.y <- lapply(train.y, factor)
Then I prepare the model parameters:
xgb_grid_1 = expand.grid(
  nrounds = 1000,
  eta = c(0.01, 0.001, 0.0001),
  max_depth = c(2, 4, 6, 8, 10),
  gamma = 1
)
# pack the training control parameters
xgb_trcontrol_1 = trainControl(
  method = "cv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",                # save losses across all models
  classProbs = TRUE,                   # set to TRUE for AUC to be computed
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)
Finally, I call the train function:
xgb_train_1 = train(
  x = train,
  y = train.y,
  trControl = xgb_trcontrol_1,
  tuneGrid = xgb_grid_1,
  method = "xgbTree"
)
It gives me:
Error in train.default(x = train, y = train.y, trControl = xgb_trcontrol_1, :
Metric RMSE not applicable for classification models
Why does this happen?
Answer 0 (score: 14)
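You should try changing train.y <- lapply(train.y, factor) to train.y <- factor(train.y, labels = c("yes", "no")). lapply() returns a list of one-element factors rather than a single factor vector, so caret does not receive a proper class column. Note that labels are assigned in sorted level order, so with this call 0 becomes "yes" and 1 becomes "no"; use c("no", "yes") if you want 1 to mean "yes".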
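The difference between the two calls can be checked in base R. This is a minimal sketch that retypes the TARGET column from the question by hand; the name `target` and the explicit `levels = c(0, 1)` argument are my own additions for illustration, not part of the original code.

```r
# TARGET column from the question, typed in by hand
target <- c(1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1)

# lapply() applies factor() to each element separately,
# so the result is a list of one-element factors, not a factor
as_list <- lapply(target, factor)
is.factor(as_list)
is.list(as_list)

# factor() on the whole vector gives one factor;
# labels are matched to the levels in the order given (0, then 1)
train.y <- factor(target, levels = c(0, 1), labels = c("no", "yes"))
levels(train.y)
table(train.y)

# The new levels are already valid R variable names
identical(levels(train.y), make.names(levels(train.y)))
```

Spelling out `levels` makes the 0/1 to label mapping explicit instead of relying on the default sort order.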
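caret usually complains when the class labels are 0 and 1, so try changing them. With classProbs = TRUE, the factor levels must be valid R variable names, which 0 and 1 are not.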
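Putting it together, a corrected call might look like the sketch below. It is not the asker's exact setup: the data are simulated (the 11 rows in the question are too few for stable 5-fold ROC estimates), the grid is reduced so it runs quickly, and it assumes that the caret, xgboost and pROC packages are installed in mutually compatible versions. The extra grid columns (colsample_bytree, min_child_weight, subsample) are included because recent caret versions expect them for xgbTree, and metric = "ROC" is set so that the selection metric matches what twoClassSummary reports.

```r
library(caret)  # assumes caret, xgboost and pROC are installed

set.seed(42)
# Simulated stand-in for the question's data (hypothetical values)
n <- 200
train <- data.frame(
  var1 = sample(1:5, n, replace = TRUE),
  var2 = sample(0:5, n, replace = TRUE)
)
target <- as.integer(train$var1 + train$var2 + rnorm(n) > 5)

# One factor vector whose levels are valid R names
train.y <- factor(target, levels = c(0, 1), labels = c("no", "yes"))

ctrl <- trainControl(
  method = "cv",
  number = 5,
  classProbs = TRUE,                  # needed for ROC
  summaryFunction = twoClassSummary   # reports ROC, Sens, Spec
)

# Reduced grid for illustration; recent caret versions expect all seven columns
grid <- expand.grid(
  nrounds = 50,
  max_depth = c(2, 4),
  eta = 0.1,
  gamma = 1,
  colsample_bytree = 1,
  min_child_weight = 1,
  subsample = 1
)

fit <- train(
  x = train,
  y = train.y,
  method = "xgbTree",
  trControl = ctrl,
  tuneGrid = grid,
  metric = "ROC"   # select on the metric that twoClassSummary computes
)
fit$results[, c("max_depth", "ROC", "Sens", "Spec")]
```

If metric is left at its default, caret may warn that "Accuracy" is not in the result set and fall back to ROC, so setting it explicitly keeps the intent clear.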