caret rfe error - Error in 1:size : result would be too long a vector

Time: 2019-02-07 07:52:59

Tags: r error-handling r-caret

I get the error above when running recursive feature elimination (RFE) with xgbTree and custom cross-validation folds. Here is some reproducible code:

library(mlbench)
library(plyr)
library(caret)
library(xgboost)

data(PimaIndiansDiabetes)
X=PimaIndiansDiabetes

Create the training and test sets

partition=createDataPartition(X$diabetes, p=0.8)
df_train=X[partition$Resample1,]
train_target_id=1:dim(df_train)[1]
train_target=df_train$diabetes
df_train_x=df_train[,-9]

Create the custom 10-fold CV sets

cases=train_target_id[df_train$diabetes=='pos']
controls=train_target_id[df_train$diabetes=='neg']

set.seed(1)
custom.folds=createFolds(controls, k=10)

set.seed(1)
for(i in 1:10){
  custom.folds[[i]]=c(controls[custom.folds[[i]]],cases)
  i=i+1
}
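To sanity-check index sets built this way, it can help to look at their size and class balance before handing them to caret. The sketch below is only an illustration: `y`, `ids`, `folds`, `fold_sizes` and `fold_balance` are hypothetical names, and the labels are synthetic stand-ins (400 "neg", 215 "pos") rather than the real training set. Note that `createFolds()` returns the held-out positions by default (`returnTrain = FALSE`), while the `index` argument of `rfeControl()` and `trainControl()` is documented as the rows used for training.

```r
library(caret)

set.seed(1)
# Synthetic stand-in for the training labels (assumption, not the real data)
y <- factor(c(rep("neg", 400), rep("pos", 215)), levels = c("neg", "pos"))
ids <- seq_along(y)
cases <- ids[y == "pos"]
controls <- ids[y == "neg"]

# Split the control ids into 10 groups, then append every case to each group,
# mirroring the loop in the question
folds <- createFolds(controls, k = 10)
folds <- lapply(folds, function(idx) c(controls[idx], cases))

# Number of rows and share of "pos" rows in each index set
fold_sizes <- sapply(folds, length)
fold_balance <- sapply(folds, function(idx) mean(y[idx] == "pos"))
print(fold_sizes)
print(round(fold_balance, 2))
```

Each set holds roughly one tenth of the controls plus all cases, so the sets are much smaller than the full training data and dominated by the "pos" class.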

caret RFE code

caretFuncs$summary<-twoClassSummary
rctrl1 <- rfeControl(method = "cv",
                     number = 10,
                     index = custom.folds,
                     functions = caretFuncs,
                     saveDetails = TRUE,
                     verbose=TRUE,
                     returnResamp = "final")

tctrl1 <- trainControl(method = "cv",
                       number = 10,
                       index = custom.folds,
                       verbose = F,
                       classProbs = TRUE,
                       summaryFunction = twoClassSummary)

xgbgrid1 = expand.grid(
  nrounds = 5, 
  max_depth = c(5, 10, 15), eta = 0.001, gamma = c(1, 2, 3), colsample_bytree = c(0.4, 0.7, 1.0), 
  min_child_weight = c(0.5, 1, 1.5), subsample = 0.5)

fit_xg <- rfe(df_train_x, train_target,
              metric = "ROC",
              method = "xgbTree",
              rfeControl = rctrl1,
              trControl=tctrl1,
              tuneGrid = xgbgrid1)
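When a call like this emits dozens of warnings, `warnings()` only shows the first 50 after the fact. A minimal sketch of collecting every warning message as it is raised, using base R's `withCallingHandlers()`, is shown here. `noisy_fit()` is a hypothetical stand-in for the long-running call and `collected` is an illustrative name; neither comes from the original code.

```r
# Hypothetical stand-in for a fit that warns once per failed fold
noisy_fit <- function() {
  for (fold in 1:3) {
    warning(sprintf("model fit failed for Fold%02d", fold))
  }
  "done"
}

collected <- character(0)

result <- withCallingHandlers(
  noisy_fit(),
  warning = function(w) {
    # Store the message, then resume execution without printing the warning
    collected <<- c(collected, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(collected)
print(result)
```

The same wrapper could be placed around the `rfe()` call so that each failure message is available as a character vector for inspection.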
  

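Error in 1:size : result would be too long a vector
In addition: There were 50 or more warnings (use warnings() to see the first 50)

> warnings()
1: In train.default(x, y, ...) :
  The metric "Accuracy" was not in the result set. ROC will be used instead.
2: model fit failed for Fold01: eta = 0.001, max_depth = 5, gamma = 1, colsample_bytree = 0.4, min_child_weight = 0.5, subsample = 0.5, nrounds = 5 Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
  [14:25:23] amalgamation/../src/objective/regression_obj.cc:98: Check failed: Loss::CheckLabel(y) label must be in [0,1] for logistic regression...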
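The warning reports labels outside [0,1]. One thing that may be worth checking (an assumption, not a confirmed diagnosis) is whether the row positions in `index` still exist in the data that the inner `train()` call receives: if `rfe()` passes only a resampled subset of rows to `train()`, positions taken from the full training set can point past the end of that subset. The base R sketch below shows what out-of-range positions do to a factor and to a 0/1 label derived from it. All names and the 0/1 encoding are hypothetical and are not caret's internals.

```r
# A 4-row subset of labels (hypothetical)
y_sub <- factor(c("neg", "pos", "neg", "pos"), levels = c("neg", "pos"))

# Positions 7 and 9 were valid in a larger data set but not in this subset
idx <- c(1, 3, 7, 9)

# Out-of-range positions silently yield NA instead of raising an error
picked <- y_sub[idx]
print(picked)

# An illustrative 0/1 encoding inherits the NA values
label <- ifelse(picked == "neg", 1, 0)
print(label)

# A simple guard that could be run before fitting
in_range <- all(idx >= 1 & idx <= length(y_sub))
print(in_range)
```

Labels containing `NA` are neither 0 nor 1, so a check like this can rule that cause in or out quickly.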

  

Session info

>sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252    LC_MONETARY=English_Singapore.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Singapore.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] xgboost_0.71.2  caret_6.0-81    ggplot2_3.1.0   lattice_0.20-38 plyr_1.8.4      mlbench_2.1-1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         gower_0.1.2        pillar_1.3.1       compiler_3.5.2     bindr_0.1.1        class_7.3-14      
 [7] iterators_1.0.10   tools_3.5.2        rpart_4.1-13       ipred_0.9-8        lubridate_1.7.4    tibble_2.0.1      
[13] gtable_0.2.0       nlme_3.1-137       pkgconfig_2.0.2    rlang_0.3.1        Matrix_1.2-15      foreach_1.4.4     
[19] rstudioapi_0.9.0   prodlim_2018.04.18 bindrcpp_0.2.2     withr_2.1.2        dplyr_0.7.8        stringr_1.3.1     
[25] generics_0.0.2     recipes_0.1.4      stats4_3.5.2       nnet_7.3-12        grid_3.5.2         tidyselect_0.2.5  
[31] glue_1.3.0         data.table_1.12.0  R6_2.3.0           survival_2.43-3    lava_1.6.4         purrr_0.2.5       
[37] reshape2_1.4.3     magrittr_1.5       splines_3.5.2      MASS_7.3-51.1      scales_1.0.0       codetools_0.2-15  
[43] ModelMetrics_1.2.2 assertthat_0.2.0   timeDate_3043.102  colorspace_1.4-0   stringi_1.2.4      lazyeval_0.2.1    
[49] munsell_0.5.0      crayon_1.3.4

Thanks to anyone who can help!

0 answers:

No answers yet