I get the above error when running recursive feature elimination (RFE) with xgbTree and custom cross-validation folds. Here is some reproducible code:
library(mlbench)
library(plyr)
library(caret)
library(xgboost)
data(PimaIndiansDiabetes)
X=PimaIndiansDiabetes
partition=createDataPartition(X$diabetes, p=0.8)
df_train=X[partition$Resample1,]
train_target_id=1:dim(df_train)[1]
train_target=df_train$diabetes
df_train_x=df_train[,-9]
cases=train_target_id[df_train$diabetes=='pos']
controls=train_target_id[df_train$diabetes=='neg']
set.seed(1)
custom.folds=createFolds(controls, k=10)
set.seed(1)
for(i in seq_along(custom.folds)){
  # each fold holds one subset of the controls plus all cases
  custom.folds[[i]]=c(controls[custom.folds[[i]]],cases)
}
caretFuncs$summary<-twoClassSummary
rctrl1 <- rfeControl(method = "cv",
number = 10,
index = custom.folds,
functions = caretFuncs,
saveDetails = TRUE,
verbose=TRUE,
returnResamp = "final")
tctrl1 <- trainControl(method = "cv",
number = 10,
index = custom.folds,
verbose = F,
classProbs = TRUE,
summaryFunction = twoClassSummary)
xgbgrid1 = expand.grid(
nrounds = 5,
max_depth = c(5, 10, 15), eta = 0.001, gamma = c(1, 2, 3), colsample_bytree = c(0.4, 0.7, 1.0),
min_child_weight = c(0.5, 1, 1.5), subsample = 0.5)
fit_xg <- rfe(df_train_x, train_target,
metric = "ROC",
method = "xgbTree",
rfeControl = rctrl1,
trControl=tctrl1,
tuneGrid = xgbgrid1)
Error in 1:size : result would be too long a vector
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
1: In train.default(x, y, ...) : The metric "Accuracy" was not in the result set. ROC will be used instead.
2: model fit failed for Fold01: eta=0.001, max_depth=5, gamma=1, colsample_bytree=0.4, min_child_weight=0.5, subsample=0.5, nrounds=5 Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
[14:25:23] amalgamation/../src/objective/regression_obj.cc:98: Check failed: Loss::CheckLabel(y) label must be in [0,1] for logistic regression...
>sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Singapore.1252 LC_CTYPE=English_Singapore.1252 LC_MONETARY=English_Singapore.1252
[4] LC_NUMERIC=C LC_TIME=English_Singapore.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] xgboost_0.71.2 caret_6.0-81 ggplot2_3.1.0 lattice_0.20-38 plyr_1.8.4 mlbench_2.1-1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 gower_0.1.2 pillar_1.3.1 compiler_3.5.2 bindr_0.1.1 class_7.3-14
[7] iterators_1.0.10 tools_3.5.2 rpart_4.1-13 ipred_0.9-8 lubridate_1.7.4 tibble_2.0.1
[13] gtable_0.2.0 nlme_3.1-137 pkgconfig_2.0.2 rlang_0.3.1 Matrix_1.2-15 foreach_1.4.4
[19] rstudioapi_0.9.0 prodlim_2018.04.18 bindrcpp_0.2.2 withr_2.1.2 dplyr_0.7.8 stringr_1.3.1
[25] generics_0.0.2 recipes_0.1.4 stats4_3.5.2 nnet_7.3-12 grid_3.5.2 tidyselect_0.2.5
[31] glue_1.3.0 data.table_1.12.0 R6_2.3.0 survival_2.43-3 lava_1.6.4 purrr_0.2.5
[37] reshape2_1.4.3 magrittr_1.5 splines_3.5.2 MASS_7.3-51.1 scales_1.0.0 codetools_0.2-15
[43] ModelMetrics_1.2.2 assertthat_0.2.0 timeDate_3043.102 colorspace_1.4-0 stringi_1.2.4 lazyeval_0.2.1
[49] munsell_0.5.0 crayon_1.3.4
Thanks to anyone who can help!