Training different ML models with caret

Asked: 2019-08-28 10:04:46

Tags: r random-forest r-caret gbm

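I am trying to train several ML models using the trainControl and train functions from caret, but I always get the same error, which says only:

Error: Stopping

No further details are given.

The problem is the same for gbm and ranger, so I wonder whether it is related to a conflict with the other packages I also load in my code.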
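As a rough way to test the package-conflict idea, base R can list names that are defined in more than one attached package. This is only a sketch, meant to be run after the `library()` calls from the question; the variable names are illustrative. Note that the call in the question already uses `caret::train` explicitly, so masking of `train` itself would not change which function is called.

```r
# Names defined by more than one entry on the search path
masked_names <- conflicts()

# Are the two caret functions used in the question among the masked names?
is_masked <- c(train = "train", trainControl = "trainControl") %in% masked_names
print(is_masked)

# Where each name is found on the search path; the first hit is the one R uses
print(find("train"))
print(find("trainControl"))
```

If neither name is reported as masked, a package conflict is an unlikely explanation and the failing model fits themselves are the next thing to inspect.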

    library(ggplot2)
    library(lattice)
    library(caret)
    library(rlang)
    library(tidyverse)
    library(Matrix)
    library(glmnet)
    library(iterators)
    library(parallel)
    library(doParallel) # parallel processing
    registerDoParallel(cores = 16)
    library(randomForest)
    library(gbm)
    library(ranger)
    library(data.table)
    library(smooth)

    data <- data.frame(A = seq(as.Date("2019-01-01"), by = 1, len = 100),
                       B = as.numeric(runif(100, 50, 150)),
                       C = as.numeric(runif(100, 50, 150)))

    # define data sets
    data_training <- data[1:60, ]
    data_test <- data[(60 + 1):nrow(data), ]

    # creating sampling seeds
    set.seed(123)

    n <- nrow(data_training)
    tuneLength.num <- 5

    seeds <- vector(mode = "list", length = n) # creates an empty list with n elements
    for (i in 1:(n - 1)) { # choose tuneLength.num random samples from 1 to 1000
        seeds[[i]] <- sample.int(1000, tuneLength.num)
    }

    # For the last model:
    seeds[[n]] <- sample.int(1000, 10)

    # Define TimeControl for training and fitting:
    trainingTimeControl <- trainControl(method = "timeslice",
                                        initialWindow = 25,
                                        horizon = 1,
                                        fixedWindow = TRUE,
                                        returnResamp = "all",
                                        allowParallel = TRUE,
                                        seeds = seeds,
                                        savePredictions = TRUE)

    gbm.mod <- caret::train(B ~ . - A,
                            data = data_training,
                            method = "gbm",
                            distribution = "gaussian",
                            trControl = trainingTimeControl,
                            tuneLength = tuneLength.num,
                            metric = "RMSE")
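For reference, caret's `trainControl` documentation describes `seeds` as a list of length B + 1, where B is the number of resamples: the first B elements hold one integer per model evaluated, and the last element needs only a single integer for the final model. The base R sketch below sizes the list accordingly. It assumes that `timeslice` resampling with no `skip` yields `n - initialWindow - horizon + 1` windows; the helper name `make_seeds` is hypothetical. The value `n_models = 5` simply mirrors `tuneLength.num` and may need to be larger, depending on how many tuning combinations are evaluated. A list that is longer than required, like the one above, may well be accepted, so this is not necessarily the cause of the error.

```r
# Hypothetical helper: build a seeds list sized for timeslice resampling.
# Assumes skip = 0, so the number of slices is n - initial_window - horizon + 1.
make_seeds <- function(n, initial_window, horizon, n_models, seed = 123) {
  n_slices <- n - initial_window - horizon + 1
  set.seed(seed)
  seeds <- vector(mode = "list", length = n_slices + 1)
  for (i in seq_len(n_slices)) {
    # one seed per model evaluated within each resample
    seeds[[i]] <- sample.int(1000, n_models)
  }
  # a single seed for the final model fit
  seeds[[n_slices + 1]] <- sample.int(1000, 1)
  seeds
}

seeds_sized <- make_seeds(n = 60, initial_window = 25, horizon = 1, n_models = 5)
length(seeds_sized)    # 36: 35 slices plus one element for the final model
lengths(seeds_sized)   # seeds stored in each element
```

The resulting list could be passed as `seeds = seeds_sized` in `trainControl`. `caret::createTimeSlices` can be used to confirm the number of windows for a given configuration.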

Edit: the following code works fine:

    gbm <- gbm(formula = B ~ . - A,
               distribution = "gaussian",
               data = data_training,
               keep.data = TRUE)

It would be great if anyone knows what is going on here. The same code works fine with svmRadial.
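One difference between the two calls is how much data each fit sees: the direct `gbm` call uses all 60 training rows, whereas each `timeslice` resample with `initialWindow = 25` fits on only 25 rows. As a hedged sketch, the snippet below refits `gbm` on a single 25-row window and captures any error message, so that a failure hidden behind `Error: Stopping` can be read directly. It relies on gbm's defaults (`bag.fraction = 0.5`, `n.minobsinnode = 10`), which may be too demanding for so few rows; the names `window_25` and `fit_msg` are illustrative.

```r
library(gbm)

set.seed(123)
data <- data.frame(A = seq(as.Date("2019-01-01"), by = 1, length.out = 100),
                   B = runif(100, 50, 150),
                   C = runif(100, 50, 150))
data_training <- data[1:60, ]

# One resampling window of the size used by trainControl in the question
window_25 <- data_training[1:25, ]

# Fit gbm directly on the small window and keep the error text, if any
fit_msg <- tryCatch({
  gbm(formula = B ~ . - A,
      distribution = "gaussian",
      data = window_25)
  "fit succeeded"
}, error = function(e) conditionMessage(e))

print(fit_msg)
```

If this reports that the data set is too small, a larger `initialWindow`, or a smaller `n.minobsinnode` supplied through `tuneGrid`, would be natural things to try. Running `train` with `allowParallel = FALSE` and then calling `warnings()` may also surface the underlying messages from the individual fits.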

0 answers:

No answers yet.