Running gbm in a loop and computing predictions for each model in R

Date: 2015-10-01 13:13:38

Tags: r gbm

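I am trying to build gbm models in a loop in R, one per learning rate. For each model I want to compute some statistics and merge them with the original data set.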

However, I get an error, because each time the statistics are computed they are saved under the same name as in the previous iteration.

I get the following error at the end of the loop:

Error in `[<-.data.frame`(`*tmp*`, nl, value = list(dates = c(14824, 14825,  : 
  duplicate subscripts for columns
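
For context, one plausible route to this message is a data frame that has picked up repeated column names. The base-R sketch below uses a made-up toy data frame (`df`, `rate`, and `label` are illustrative names, not taken from the question) to show that `cbind()` names a new column after the variable passed to it, so repeating the same call produces duplicate names. It does not reproduce the gbm error itself.

```r
# Toy data frame; the values are invented for illustration only.
df <- data.frame(Close = c(10.2, 10.5, 10.1))

for (rate in c(0.07, 0.08)) {
  # `label` holds a string, just like a name built with paste().
  label <- paste0("prediction", rate)
  # cbind() names the new column after the variable ("label") and
  # fills it with the recycled string, not with any predictions.
  df <- cbind(df, label)
}

names(df)                 # "Close" "label" "label"
anyDuplicated(names(df))  # 3: the third name repeats an earlier one
```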

The train data is basically stock price data, with the date, open, high, low, close, and so on.

Here is the code:

learningRateList <- as.numeric(7:9)*0.01

for (i in learningRateList){
  modelNames <- paste("gbmModel", i, sep = "") 
  gbmModels <-gbm.step(data=train, gbm.x = reqCol, gbm.y = CloseCol,tree.complexity =9,learning.rate = i,bag.fraction = 0.75,family ="laplace",step.size=100 )
  assign(modelNames, gbmModels)

  #training data
  #predict values for the training data set
  predTrainGbm<-paste("gbmTrainPrediction", i, sep = "")
  gbmTrainPrediction <- predict.gbm(gbmModels,train,n.trees=gbmModels$gbm.call$best.trees)
  assign(predTrainGbm,gbmTrainPrediction)
  #calculate mape for the predictions
  mapeTrain<-paste("mapeGbmTrain", i, sep = "")
  mapeTrainGbm<-regr.eval(train$Close,gbmTrainPrediction,stats = "mape")
  assign(mapeTrain,mapeTrainGbm)

  train<-cbind(train,predTrainGbm,mapeTrain)

  #creating plots of actual vs predicted values
  imageGbmName<-paste(fileCalculated,"Gbm Prediction",i,".png")
  png(imageGbmName)

  par(mfrow=c(2,1))
  plot(train$Close,type="l",col="red",main = "Training set") 
  lines(gbmTrainPrediction,col="green")

  plot(test$Close,type="l",col="red",main = "Test Set") 
  lines(gbmTestPrediction,col="green")


  dev.off()
}
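
One way to keep the per-model results from colliding is to use the generated string as the column name with `[[<-`, and to keep the scalar MAPE values in a separate named vector instead of binding them onto `train`. The sketch below is a minimal illustration under stated assumptions, not a tested fix for the loop above: the toy `train` data is invented, `fitAndPredict` is a hypothetical stand-in for `gbm.step()` plus `predict.gbm()` so the example runs with base R only, and the MAPE is computed by hand rather than with `regr.eval()`.

```r
# Toy stand-in for the question's `train` data; the values are invented.
set.seed(1)
train <- data.frame(Open = runif(20, min = 10, max = 20))
train$Close <- train$Open + rnorm(20, sd = 0.5)

learningRateList <- (7:9) * 0.01

# Hypothetical stand-in for gbm.step() plus predict.gbm(): a plain lm() fit.
# It ignores the learning rate; only the bookkeeping is being illustrated.
fitAndPredict <- function(data, learningRate) {
  fit <- lm(Close ~ Open, data = data)
  unname(predict(fit, newdata = data))
}

mapeGbmTrain <- numeric(0)  # one MAPE per learning rate, kept outside `train`

for (i in learningRateList) {
  predCol <- paste0("gbmTrainPrediction", i)
  pred <- fitAndPredict(train, i)

  # [[<- uses the string stored in predCol as the column name,
  # so each learning rate gets its own column.
  train[[predCol]] <- pred

  # MAPE as a fraction: mean of |actual - predicted| / |actual|.
  mapeGbmTrain[predCol] <- mean(abs((train$Close - pred) / train$Close))
}

names(train)   # the original columns plus one prediction column per rate
mapeGbmTrain   # named vector with one entry per learning rate
```

In the original loop the same pattern would apply with `gbmTrainPrediction` in place of `pred`. Note also that the posted loop plots `gbmTestPrediction`, which is not created in the code shown, so it would need its own prediction step on `test`.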

0 answers:

No answers