Question

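Why does the first part of the code below run, while the later part fails with an error? I am learning data mining from the page and trying to understand how to perform cross-validation with the LGOCV option.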

library(mlbench)

 data(Sonar)

 str(Sonar)

 library(caret)

 set.seed(998)

 inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)

 training <- Sonar[inTraining, ]

 testing <- Sonar[-inTraining, ]

 fitControl <- trainControl(## 10-fold CV 
                           method = "repeatedcv", 
                           number = 10, 
                           ## repeated ten times 
                           repeats = 10) 


 gbmGrid <-  expand.grid(.interaction.depth = c(1, 5, 9),  
                         .n.trees = (1:15)*100,  
                         .shrinkage = 0.1) 

 fitControl <- trainControl(method = "repeatedcv",
                            number = 10,
                            repeats = 10,
                            ## Estimate class probabilities
                            classProbs = TRUE,
                            ## Evaluate performance using 
                            ## the following function
                            summaryFunction = twoClassSummary)   

 set.seed(825)

 gbmFit3 <- train(Class ~ ., data = training, 
                  method = "gbm", 
                  trControl = fitControl, 
                  verbose = FALSE, 
                  tuneGrid = gbmGrid,
                  ## Specify which metric to optimize
                  metric = "ROC")

 gbmFit3

The following part gives an error :(

datarow <- 1:nrow(training)

fitControl <- trainControl(method = "LGOCV",
                     summaryFunction = twoClassSummary,
                     classProbs = TRUE,
                     index = list(TrainSet = datarow ),
                     savePredictions = TRUE)


gbmFit4 <- train(Class ~ ., data = training, 
                  method = "gbm", 
                  trControl = fitControl, 
                  verbose = FALSE, 
                  tuneGrid = gbmGrid,
                  ## Specify which metric to optimize
                  metric = "ROC")

My error is as follows:

Error in { : 
  task 1 failed - "arguments imply differing number of rows: 0, 1"
In addition: Warning messages:
1: In eval(expr, envir, enclos) :
  predictions failed for TrainSet: interaction.depth=1, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0

2: In eval(expr, envir, enclos) :
  predictions failed for TrainSet: interaction.depth=5, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0

3: In eval(expr, envir, enclos) :
  predictions failed for TrainSet: interaction.depth=9, shrinkage=0.1, n.trees=150
session info:

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gbm_2.1         survival_2.37-4 mlbench_2.1-1   pROC_1.5.4      caret_5.17-7    reshape2_1.2.2 
 [7] plyr_1.8        lattice_0.20-15 foreach_1.4.1   cluster_1.14.4 

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.0.1  grid_3.0.1      iterators_1.0.6 stringr_0.6.2   tools_3.0.1

Answer 1

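You also posted the same question on CrossValidated. We usually say to make sure you don't have an error before seeking help, and then to contact the package author.

The problem is your use of datarow <- 1:nrow(training). You are tuning the model on all of the instances and leaving nothing to compute the hold-out estimates on.

I'm not sure what you are trying to do.

Max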
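
To illustrate the point in the answer, here is a minimal sketch (not the answerer's own code) of two ways to set up LGOCV so that some rows are actually held out. It assumes the same Sonar split as in the question; the values p = 0.75 and 25 resamples are illustrative choices, and the names heldOut, trainIndex and fitControl2 are made up for this example.

```r
library(caret)
library(mlbench)

# Recreate the training set from the question.
data(Sonar)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training <- Sonar[inTraining, ]

# With index = every row, no rows are left over for prediction.
datarow <- 1:nrow(training)
heldOut <- setdiff(seq_len(nrow(training)), datarow)
length(heldOut)  # 0: nothing to compute hold-out performance on

# Option 1: let caret draw the groups itself.
# p is the fraction of rows used for fitting in each resample,
# number is how many resamples to draw (both values are assumptions).
fitControl <- trainControl(method = "LGOCV",
                           p = 0.75,
                           number = 25,
                           summaryFunction = twoClassSummary,
                           classProbs = TRUE,
                           savePredictions = TRUE)

# Option 2: build the training-row indices explicitly and pass them via index.
# Each list element holds the rows to fit on; the remaining rows are held out.
set.seed(825)
trainIndex <- createDataPartition(training$Class, p = 0.75, times = 25)
fitControl2 <- trainControl(method = "LGOCV",
                            index = trainIndex,
                            summaryFunction = twoClassSummary,
                            classProbs = TRUE,
                            savePredictions = TRUE)
```

Either control object can then be passed as trControl to the train() call from the question. The key difference from the failing code is that every element of index covers only part of the training rows, so each resample has rows left to predict on.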
