Here is my code.
library(dplyr)
library(caret)
library(xgboost)

data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")
train <- agaricus.train
test <- agaricus.test

xgb_grid_1 <- expand.grid(
  nrounds = c(1:10),
  eta = c(seq(0, 1, 0.1)),
  max_depth = c(2:5),
  gamman = c(seq(0, 1, 0.1))
)

xgb_trcontrol_1 <- trainControl(
  method = "cv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",
  classProbs = TRUE,
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)

xgb_train1 <- train(
  x = as.matrix(train$data),
  y = train$label,
  trControl = xgb_trcontrol_1,
  tune_grid = xgb_grid_1,
  method = "xgbTree"
)
When I run the xgb_train1 training call, I get this error message:
Error in frankv(predicted) : x is a list, 'cols' can not be 0-length
In addition: Warning messages:
1: In train.default(x = train$data, y = train$label, trControl = xgb_trcontrol_1, :
  You are trying to do regression and your outcome only has two possible values. Are you trying to do classification? If so, use a 2 level factor as your outcome column.
2: In train.default(x = train$data, y = train$label, trControl = xgb_trcontrol_1, :
  cannot compute class probabilities for regression
What should I do? Please advise.
Answer (score: 1)
There are several problems with your code. caret::train has no tune_grid argument; the correct name is tuneGrid. Your tuning grid also misspells gamma as gamman and omits the remaining xgbTree parameters (colsample_bytree, min_child_weight, subsample); the grid must contain a column for every tuning parameter of the method. Finally, for classification the outcome y must be a two-level factor or character vector; train$label is numeric 0/1, so caret runs a regression instead. That is exactly what the warning is telling you: "You are trying to do regression and your outcome only has two possible values. Are you trying to do classification? If so, use a 2 level factor as your outcome column."
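If you are unsure which columns the grid needs, caret's modelLookup() lists the tuning parameters of a method. A quick check for xgbTree:

library(caret)

# Returns one row per tuning parameter; for xgbTree this lists nrounds,
# max_depth, eta, gamma, colsample_bytree, min_child_weight and subsample
modelLookup("xgbTree")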
Here is code that should work:
library(caret)
library(xgboost)

data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")
train <- agaricus.train
test <- agaricus.test

# Convert the 0/1 target to character (or factor) so caret treats this as classification
train$label <- ifelse(train$label == 0, "no", "yes")

xgb_grid_1 <- expand.grid(
  nrounds = 100,
  eta = c(0.01, 0.001, 0.0001),
  max_depth = c(2, 4, 6, 8, 10),
  gamma = 1,
  colsample_bytree = 0.6,
  min_child_weight = 1,
  subsample = 0.75
)

xgb_trcontrol_1 <- trainControl(
  method = "cv",
  number = 3,
  search = "grid",
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",
  classProbs = TRUE,
  summaryFunction = twoClassSummary
)

xgb_train1 <- caret::train(
  x = as.matrix(train$data),
  y = train$label,
  trControl = xgb_trcontrol_1,
  tuneGrid = xgb_grid_1,
  metric = "ROC",
  method = "xgbTree"
)
#output
eXtreme Gradient Boosting
No pre-processing
Resampling: Cross-Validated (3 fold)
Summary of sample sizes: 4343, 4341, 4342
Resampling results across tuning parameters:
eta max_depth ROC Sens Spec
1e-04 2 0.9963189 0.9780604 0.9656045
1e-04 4 0.9999604 0.9985172 0.9974527
1e-04 6 1.0000000 1.0000000 0.9974527
1e-04 8 1.0000000 1.0000000 0.9974527
1e-04 10 1.0000000 1.0000000 0.9974527
1e-03 2 0.9972687 0.9629358 0.9713391
1e-03 4 0.9999479 0.9985172 0.9974527
1e-03 6 1.0000000 1.0000000 0.9974527
1e-03 8 1.0000000 1.0000000 0.9974527
1e-03 10 1.0000000 1.0000000 0.9977714
1e-02 2 0.9990705 0.9780604 0.9757951
1e-02 4 0.9999674 1.0000000 0.9974527
1e-02 6 1.0000000 1.0000000 0.9977714
1e-02 8 1.0000000 1.0000000 0.9977714
1e-02 10 1.0000000 1.0000000 0.9977714
Tuning parameter 'nrounds' was held constant at a value of 100
Tuning parameter 'gamma' was held constant at a value of 1
Tuning parameter 'colsample_bytree' was held constant at a value of 0.6
Tuning parameter 'min_child_weight' was held constant at a value of 1
Tuning parameter 'subsample' was held constant at a value of 0.75
ROC was used to select the optimal model using the largest value.
The final values used for the model were nrounds = 100, max_depth = 6,
eta = 1e-04, gamma = 1, colsample_bytree = 0.6, min_child_weight = 1
and subsample = 0.75.
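Once training finishes, you can score the held-out agaricus test set. A minimal sketch using caret's predict() and confusionMatrix(); the "no"/"yes" recoding mirrors what was done to the training labels above:

# Recode the test labels the same way as the training labels
test_labels <- factor(ifelse(test$label == 0, "no", "yes"), levels = c("no", "yes"))

# Class predictions and class probabilities from the tuned model
pred_class <- predict(xgb_train1, newdata = as.matrix(test$data))
pred_prob  <- predict(xgb_train1, newdata = as.matrix(test$data), type = "prob")

# Accuracy, sensitivity and specificity on the test set
confusionMatrix(pred_class, test_labels)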