The train function in caret returns an error message

Date: 2017-07-21 15:21:19

Tags: r r-caret

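I am using caret's train() function to find the best cp value for a CART decision tree, with a custom summary function that uses F1 as the metric. train() returns an error that I cannot understand. Perhaps the problem lies in how I defined the f1 function. I have included a reproducible example and would greatly appreciate your advice.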

> library(caret)
> library(data.table)
> library(ROSE)
> data(hacide)
> train <- hacide.train
> test <- hacide.test
> numFolds = trainControl(method = "cv" , number = 10)
> cpGrid = expand.grid(.cp = seq(0.01, 0.5, 0.01))
> f1 <- function(data, lev = NULL, model = NULL) {
+   f1_val <- F1_Score(y_pred = data$pred, y_true = data$obs, positive = lev[1])
+   c(F1 = f1_val)
+ }
> set.seed(12)
> train(cls ~ ., data = train,
+              method = "rpart",
+              tuneLength = 5,
+              metric = "F1",
+              trControl = trainControl(summaryFunction = f1, 
+                                       classProbs = TRUE))
Error in train.default(x, y, weights = w, ...) : 
  At least one of the class levels is not a valid R variable name; This will cause errors when class probabilities are generated because the variables names will be converted to  X0, X1 . Please use factor levels that can be used as valid R variable names  (see ?make.names for help).
> levels(train$cls)
[1] "0" "1"
> class(train$cls)
[1] "factor"

1 answer:

Answer 0 (score: 0)

You can try the following:

levels(train$cls) <- make.names(levels(train$cls)) 

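Then run the model; this should resolve the error. Unfortunately, your example is not reproducible, because the question omits the definition of the F1_Score function. See whether this works.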
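As background on why the levels "0" and "1" are rejected, here is a minimal base R sketch of what make.names() does to them. The vector and factor below are toy values made up for illustration; they are not taken from the hacide data.

```r
# Levels that start with a digit are not syntactically valid R names,
# so make.names() prefixes them with "X".
old_levels <- c("0", "1")
new_levels <- make.names(old_levels)
print(new_levels)  # "X0" "X1"

# Relabelling a factor this way renames the levels only;
# the observations and their order are unchanged.
cls <- factor(c("0", "1", "1", "0"))
levels(cls) <- make.names(levels(cls))
print(table(cls))
```

These are the same names, X0 and X1, that the error message says the class-probability columns would be converted to.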

The following worked for me:

levels(train$cls) <- make.names(levels(train$cls))
set.seed(12)
train(cls ~ ., data = train,
      method = "rpart",
      tuneLength = 5,
      metric = "ROC",
      trControl = trainControl(summaryFunction = twoClassSummary,
                               classProbs = TRUE))
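The call above tunes on ROC rather than on the F1 metric the question asked for. If F1 is still wanted, one possible approach is a summary function written in base R, so that it does not depend on the undefined F1_Score. The name f1_summary and the toy data frame are hypothetical, and the sketch assumes, as the question's own f1 function does, that the first factor level is the positive class.

```r
# Hypothetical helper (not from the original post): F1 computed in base R.
# It follows the summaryFunction signature (data, lev, model) and returns
# a named numeric vector. Assumption: lev[1] is the positive class.
f1_summary <- function(data, lev = NULL, model = NULL) {
  if (is.null(lev)) lev <- levels(data$obs)
  positive <- lev[1]
  tp <- sum(data$pred == positive & data$obs == positive)  # true positives
  fp <- sum(data$pred == positive & data$obs != positive)  # false positives
  fn <- sum(data$pred != positive & data$obs == positive)  # false negatives
  # F1 = 2 * TP / (2 * TP + FP + FN); NaN when all three counts are zero.
  c(F1 = 2 * tp / (2 * tp + fp + fn))
}

# Toy check on made-up predictions, using the renamed levels X0 and X1.
toy <- data.frame(
  obs  = factor(c("X0", "X0", "X1", "X1"), levels = c("X0", "X1")),
  pred = factor(c("X0", "X1", "X1", "X1"), levels = c("X0", "X1"))
)
toy_f1 <- f1_summary(toy, lev = levels(toy$obs))
print(toy_f1)
```

After the levels have been renamed with make.names(), a function of this shape could be passed as trainControl(summaryFunction = f1_summary, classProbs = TRUE) together with metric = "F1" in train(). That combination has not been run against the hacide data here, so treat it as a starting point.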