R caret train: Error in evalSummaryFunction: cannot compute class probabilities for regression

Asked: 2014-05-19 11:57:35

Tags: r r-caret

> cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
+                         summaryFunction = twoClassSummary,
+                         classProbs = TRUE)
> 
> set.seed(35)
> glm.tune.1 <- train(y ~ bool_3,
+                     data = train.batch,
+                     method = "glm",
+                     metric = "ROC",
+                     trControl = cv.ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression


 > str(train.batch)
'data.frame':   128046 obs. of  42 variables:
 $ offer               : int  1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 ...
 $ avgPrice            : num  2.68 2.68 2.68 2.68 2.68 ...
 ...
 $ bool_3              : int  0 0 0 0 0 0 0 1 0 0 ...
 $ y                   : num  0 1 0 0 0 1 1 1 1 0 ...

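Since classProbs is set to TRUE in cv.ctrl, I don't understand why this error message appears.

Can anyone advise?

2 answers:

Answer 0: (score: 5)

Apparently this error occurs because my y is not a factor.

The following code runs fine: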

library(caret)
library(mlbench)
data(Sonar)

ctrl <- trainControl(method = "cv", 
                     summaryFunction = twoClassSummary, 
                     classProbs = TRUE)
set.seed(1)
gbmTune <- train(Class ~ ., data = Sonar,
                 method = "gbm",
                 metric = "ROC",
                 verbose = FALSE,                    
                 trControl = ctrl)

Then convert the outcome to numeric:

Sonar$Class = as.numeric(Sonar$Class)

and the same code throws the error:

> gbmTune <- train(Class ~ ., data = Sonar,
+                  method = "gbm",
+                  metric = "ROC",
+                  verbose = FALSE,                    
+                  trControl = ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression

However, the documentation for caret's train says:

y   a numeric or factor vector containing the outcome for each sample.
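The documentation line and the error do not actually conflict: a numeric y is accepted, but train() then treats the problem as regression, and class probabilities only exist for a factor outcome. The sketch below uses base R only and a made-up 0/1 vector (y_num and y_fac are hypothetical names, not from the question) to show the check and the conversion. It also shows why the labels matter: with classProbs = TRUE the factor levels are used as column names, so they should be valid R names rather than "0" and "1".

```r
# Toy outcome coded as 0/1 numbers, like `y` in the question (hypothetical data).
y_num <- c(0, 1, 0, 0, 0, 1, 1, 1, 1, 0)

# A numeric outcome is not a factor, so it is handled as a regression target.
is.factor(y_num)

# Convert to a factor with explicit, name-safe labels.
y_fac <- factor(y_num, levels = c(0, 1), labels = c("No", "Yes"))
is.factor(y_fac)
levels(y_fac)

# "0" and "1" are not syntactically valid R names;
# make.names() shows what a valid version would look like.
make.names(c("0", "1"))  # "X0" "X1"
```

Checking str() on the data frame before calling train(), as the question does, is a quick way to spot an outcome stored as num or int instead of Factor.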

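Answer 1: (score: 1)

If you change the values in y to "Yes" and "No" instead of 1 and 0, and store the result as a factor, the code will run.

# Recode the 0/1 outcome inside the data frame, since train() reads y from `data`
train.batch$y <- factor(ifelse(train.batch$y == 0, "No", "Yes"))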
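For reference, here is a minimal end-to-end sketch of the recoding applied before the call from the question. The real train.batch is not available, so toy.batch and glm.tune.toy are hypothetical stand-ins built from simulated data. It assumes the caret package (and the pROC package that twoClassSummary relies on) is installed.

```r
library(caret)

# Hypothetical stand-in for train.batch: one 0/1 predictor and a 0/1 outcome.
set.seed(35)
n <- 300
toy.batch <- data.frame(bool_3 = rbinom(n, 1, 0.3))
toy.batch$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * toy.batch$bool_3))

# Recode the outcome as a factor so train() fits a classifier, not a regression.
toy.batch$y <- factor(ifelse(toy.batch$y == 0, "No", "Yes"),
                      levels = c("No", "Yes"))

# Same resampling setup as in the question.
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
                        summaryFunction = twoClassSummary,
                        classProbs = TRUE)

set.seed(35)
glm.tune.toy <- train(y ~ bool_3,
                      data = toy.batch,
                      method = "glm",
                      metric = "ROC",
                      trControl = cv.ctrl)

# Resampled ROC, sensitivity and specificity.
glm.tune.toy$results
```

The explicit levels argument only pins the level order. It matters if you care which class the sensitivity and specificity columns refer to.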