"Not a valid R variable name" error in caret's train function

Time: 2016-04-15 22:34:20

Tags: r r-caret

I want to use caret's train function to examine xgboost results:

library(caret)  # provides train(), trainControl() and twoClassSummary()

# open file with training data
trainy <- read.csv('')
# open file with test data
test <- read.csv('')

# we don't need the ID column

##### Removing IDs
trainy$ID <- NULL
test.id <- test$ID
test$ID <- NULL

##### Extracting TARGET
trainy.y <- trainy$TARGET

trainy$TARGET <- NULL


# set up the cross-validated hyper-parameter search
xgb_grid_1 = expand.grid(
  nrounds = 1000,
  eta = c(0.01, 0.001, 0.0001),
  max_depth = c(2, 4, 6, 8, 10),
  gamma = 1
)

# pack the training control parameters
xgb_trcontrol_1 = trainControl(
  method = "cv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",    # save losses across all models
  classProbs = TRUE,       # set to TRUE for AUC to be computed
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)

# train the model for each parameter combination in the grid, 
#   using CV to evaluate
xgb_train_1 = train(
  x = as.matrix(trainy),
  y = as.factor(trainy.y),
  trControl = xgb_trcontrol_1,
  tuneGrid = xgb_grid_1,
  method = "xgbTree"
)

I get this error:

Error in train.default(x = as.matrix(trainy), y = as.factor(trainy.y), trControl = xgb_trcontrol_1,  : 
  At least one of the class levels is not a valid R variable name;

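I have looked at other cases, but I still can't work out what I should change. R feels completely different from Python to me right now.

I can see that I should do something with the y class variable, but what exactly? Why doesn't as.factor work here?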

1 Answer:

Answer 0: (score: 0)

I solved this problem, and I hope this helps other beginners.

I needed to convert all of the data to factor type, like this:
trainy[] <- lapply(trainy, factor)
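
For readers who hit the same message, here is a minimal sketch of a different way to look at it, not the answerer's method. It assumes TARGET holds numeric 0/1 labels, which the question does not confirm. With classProbs = TRUE, caret seems to require every class level of y to be a syntactically valid R name, and as.factor() on 0/1 produces the levels "0" and "1", which are not. The vector target and the labels "no"/"yes" below are hypothetical stand-ins, and only base R is used.

```r
# Hypothetical stand-in for trainy$TARGET (0/1 labels are an assumption)
target <- c(0, 1, 1, 0, 1)

# as.factor() keeps the original values as level names
y <- as.factor(target)
levels(y)                             # "0" "1"

# A level is a valid R name only if make.names() leaves it unchanged
levels(y) == make.names(levels(y))    # FALSE FALSE

# Option 1: let make.names() repair the levels ("0" -> "X0", "1" -> "X1")
y_fixed <- y
levels(y_fixed) <- make.names(levels(y_fixed))

# Option 2: assign descriptive labels explicitly (label names are illustrative)
y_named <- factor(target, levels = c(0, 1), labels = c("no", "yes"))
levels(y_named)                       # "no" "yes"
```

Either y_fixed or y_named could then be passed as the y argument of train() in place of as.factor(trainy.y). Note that the relabeling applies to the outcome vector only; the predictor columns in trainy are left as they are.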