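Why does the first part of the following code run, while the later part throws an error when I try to run it? I am learning data mining from the page and trying to understand how to perform cross-validation with the LGOCV option.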
library(mlbench)
data(Sonar)
str(Sonar)
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training <- Sonar[inTraining, ]
testing <- Sonar[-inTraining, ]
fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)
gbmGrid <- expand.grid(.interaction.depth = c(1, 5, 9),
                       .n.trees = (1:15)*100,
                       .shrinkage = 0.1)
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10,
                           ## Estimate class probabilities
                           classProbs = TRUE,
                           ## Evaluate performance using
                           ## the following function
                           summaryFunction = twoClassSummary)
set.seed(825)
gbmFit3 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 verbose = FALSE,
                 tuneGrid = gbmGrid,
                 ## Specify which metric to optimize
                 metric = "ROC")
gbmFit3
The following part gives an error :(
datarow <- 1:nrow(training)
fitControl <- trainControl(method = "LGOCV",
                           summaryFunction = twoClassSummary,
                           classProbs = TRUE,
                           index = list(TrainSet = datarow ),
                           savePredictions = TRUE)
gbmFit4 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 verbose = FALSE,
                 tuneGrid = gbmGrid,
                 ## Specify which metric to optimize
                 metric = "ROC")
My error is as follows:
Error in { :
task 1 failed - "arguments imply differing number of rows: 0, 1"
In addition: Warning messages:
1: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=1, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0
2: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=5, shrinkage=0.1, n.trees=1500 Error in 1:ncol(tmp) : argument of length 0
3: In eval(expr, envir, enclos) :
predictions failed for TrainSet: interaction.depth=9, shrinkage=0.1, n.trees=150
session info:
sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel splines stats graphics grDevices utils datasets methods base
other attached packages:
[1] gbm_2.1 survival_2.37-4 mlbench_2.1-1 pROC_1.5.4 caret_5.17-7 reshape2_1.2.2
[7] plyr_1.8 lattice_0.20-15 foreach_1.4.1 cluster_1.14.4
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.0.1 grid_3.0.1 iterators_1.0.6 stringr_0.6.2 tools_3.0.1
Answer 0 (score: 1)
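You also posted this same question on CrossValidated. The usual advice is to make sure the mistake is not your own before asking for help, and then to contact the package author.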
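The problem is your use of datarow <- 1:nrow(training). Passing it as index tells train to fit the model on every row of training in the single resample, so no rows are left over to compute the hold-out estimates.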
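To make the point concrete, here is a minimal base R sketch of how the hold-out rows follow from an index element. It assumes the hold-out set is simply the rows not listed in the index, which is the behaviour the answer describes. The row count n_rows and the variable names are illustrative, not taken from the question.

```r
# Illustrative row count (an assumption; the real value is nrow(training)).
n_rows <- 100

# Index used in the question: every row is assigned to model fitting.
datarow     <- seq_len(n_rows)
holdout_all <- setdiff(seq_len(n_rows), datarow)
length(holdout_all)    # no rows remain for hold-out predictions

# An index covering only part of the rows leaves the rest as hold-out.
set.seed(1)
partial_rows    <- sample(n_rows, 75)
holdout_partial <- setdiff(seq_len(n_rows), partial_rows)
length(holdout_partial)
```

With the full index the hold-out vector is empty, which fits the "argument of length 0" warnings in the question.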
I'm not sure what you are trying to do.
Max
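If the goal is ordinary leave-group-out cross-validation, one possible setup is sketched below. It is not taken from the answer. It assumes a 75% training fraction and 25 random splits, both arbitrary choices, and the names lgoControl, splits and idxControl are made up for illustration. Option A lets trainControl draw the splits. Option B builds them with createDataPartition, so every index element is a strict subset of the rows.

```r
library(caret)
library(mlbench)
data(Sonar)

set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training   <- Sonar[inTraining, ]

# Option A: let caret generate the leave-group-out splits.
lgoControl <- trainControl(method = "LGOCV",
                           p = 0.75,       # assumed fraction fitted per split
                           number = 25,    # assumed number of random splits
                           classProbs = TRUE,
                           summaryFunction = twoClassSummary,
                           savePredictions = TRUE)

# Option B: supply explicit splits; each list element holds the rows
# used for fitting, and the remaining rows become the hold-out set.
set.seed(825)
splits <- createDataPartition(training$Class, p = 0.75, times = 25)
idxControl <- trainControl(method = "LGOCV",
                           index = splits,
                           classProbs = TRUE,
                           summaryFunction = twoClassSummary,
                           savePredictions = TRUE)
```

Either control object can then be passed as trControl in the train call from the question, in place of the fitControl built from datarow.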