Error when running h2o.ensemble

Date: 2015-12-09 05:55:23

Tags: r machine-learning h2o

I get an error when running h2o.ensemble in R. This is the error output:

[1] "Cross-validating and training base learner 1: h2o.glm.wrapper"
  |======================================================================| 100%
[1] "Cross-validating and training base learner 2: h2o.randomForest.1"
  |==============                                                        |  19%

Got exception 'class java.lang.AssertionError', with msg 'null'
java.lang.AssertionError
    at hex.tree.DHistogram.scoreMSE(DHistogram.java:323)
    at hex.tree.DTree$DecidedNode$FindSplits.compute2(DTree.java:441)
    at hex.tree.DTree$DecidedNode.bestCol(DTree.java:421)
    at hex.tree.DTree$DecidedNode.<init>(DTree.java:449)
    at hex.tree.SharedTree.makeDecided(SharedTree.java:489)
    at hex.tree.SharedTree$ScoreBuildOneTree.onCompletion(SharedTree.java:436)
    at jsr166y.CountedCompleter.__tryComplete(CountedCompleter.java:425)
    at jsr166y.CountedCompleter.tryComplete(CountedCompleter.java:383)
    at water.MRTask.compute2(MRTask.java:683)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1069)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)


Error: 'null'

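This is the code I am using. The script is for a regression problem: the "Sales" column is the response to predict, and the remaining columns are used as predictors.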

response <- "Sales"
predictors <- setdiff(names(train), response)

h2o.glm.1 <- function(..., alpha = 0.0) h2o.glm.wrapper(..., alpha = alpha)
h2o.glm.2 <- function(..., alpha = 0.5) h2o.glm.wrapper(..., alpha = alpha)
h2o.glm.3 <- function(..., alpha = 1.0) h2o.glm.wrapper(..., alpha = alpha)
h2o.randomForest.1 <- function(..., ntrees = 200, nbins = 50, seed = 1) h2o.randomForest.wrapper(..., ntrees = ntrees, nbins = nbins, seed = seed)
h2o.randomForest.2 <- function(..., ntrees = 200, sample_rate = 0.75, seed = 1) h2o.randomForest.wrapper(..., ntrees = ntrees, sample_rate = sample_rate, seed = seed)
h2o.gbm.1 <- function(..., ntrees = 100, seed = 1) h2o.gbm.wrapper(..., ntrees = ntrees, seed = seed)
h2o.gbm.6 <- function(..., ntrees = 100, col_sample_rate = 0.6, seed = 1) h2o.gbm.wrapper(..., ntrees = ntrees, col_sample_rate = col_sample_rate, seed = seed)
h2o.gbm.8 <- function(..., ntrees = 100, max_depth = 3, seed = 1) h2o.gbm.wrapper(..., ntrees = ntrees, max_depth = max_depth, seed = seed)
# epochs is now forwarded to the wrapper; previously it was accepted but never passed on
h2o.deeplearning.1 <- function(..., hidden = c(500,500), activation = "Rectifier", epochs = 50, seed = 1)  h2o.deeplearning.wrapper(..., hidden = hidden, activation = activation, epochs = epochs, seed = seed)
h2o.deeplearning.6 <- function(..., hidden = c(50,50), activation = "Rectifier", epochs = 50, seed = 1)  h2o.deeplearning.wrapper(..., hidden = hidden, activation = activation, epochs = epochs, seed = seed)
h2o.deeplearning.7 <- function(..., hidden = c(100,100), activation = "Rectifier", epochs = 50, seed = 1)  h2o.deeplearning.wrapper(..., hidden = hidden, activation = activation, epochs = epochs, seed = seed)

print("learning starts ")
#### Customized base learner library
learner <- c("h2o.glm.wrapper",
             "h2o.randomForest.1", "h2o.randomForest.2",
             "h2o.gbm.1", "h2o.gbm.6", "h2o.gbm.8",
             "h2o.deeplearning.1", "h2o.deeplearning.6", "h2o.deeplearning.7")

metalearner <- "h2o.glm.wrapper"
# Train with new library:
fit <- h2o.ensemble(
  x = predictors,
  y = response,
  training_frame = train,
  family = "gaussian",
  learner = learner,
  metalearner = metalearner,
  cvControl = list(V = 5))

All columns of the training data are numeric. I am using R version 3.2.2.
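The claim that every column is numeric can be double-checked before the data is handed to H2O. The following is only a minimal sketch: it assumes the data is still a local R data.frame (the `train` object in the question may already be an H2O frame, in which case this does not apply directly), and `non_numeric_columns` is a hypothetical helper name, not part of h2o.

```r
# Hypothetical helper (not part of h2o): return the names of columns in a
# local data.frame that are not numeric.
non_numeric_columns <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  names(df)[!is_num]
}

# Made-up example data, not the asker's data set.
example <- data.frame(
  Sales = c(10.5, 20, 30),
  Store = c(1L, 2L, 3L),
  Day   = c("Mon", "Tue", "Wed")
)

# Lists any column that would not be treated as numeric.
print(non_numeric_columns(example))
```

An empty result means every column is numeric; anything listed would be treated as categorical or string data after conversion.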

2 answers:

Answer 0: (score: 4)

As Spencer Aiello suggested, setting assertion to FALSE when initializing h2o may work:

h2o.init(nthreads=-1, assertion = FALSE)

Make sure h2o is properly shut down and restarted so that the change takes effect:

h2o.shutdown()
h2o.init(nthreads=-1, assertion = FALSE)

Answer 1: (score: 3)

The updated way to do this is:

h2o.init(nthreads=-1,enable_assertions = FALSE)
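The two answers differ only in the argument name (`assertion` in the older call, `enable_assertions` in the newer one). A script that has to run against either version could, as a rough sketch, inspect the formal arguments of `h2o.init` and use whichever name is present. The helpers `pick_assertion_arg` and `build_init_args` below are hypothetical names introduced for illustration, not part of h2o, and the usage lines at the end assume the h2o package is installed.

```r
# Hypothetical helper (not part of h2o): return the name of the assertion
# argument that the given init function accepts, or NA if it accepts neither.
pick_assertion_arg <- function(init_fn) {
  arg_names <- names(formals(init_fn))
  if ("enable_assertions" %in% arg_names) {
    "enable_assertions"
  } else if ("assertion" %in% arg_names) {
    "assertion"
  } else {
    NA_character_
  }
}

# Hypothetical helper: build the argument list for the init call with
# assertions switched off under whichever name is supported.
build_init_args <- function(init_fn, nthreads = -1) {
  args <- list(nthreads = nthreads)
  arg <- pick_assertion_arg(init_fn)
  if (!is.na(arg)) args[[arg]] <- FALSE
  args
}

# Usage, assuming the h2o package is installed and a cluster is running:
# library(h2o)
# h2o.shutdown(prompt = FALSE)
# do.call(h2o.init, build_init_args(h2o.init))
```

Note that disabling assertions only suppresses the check that raised the error; it does not explain why the assertion in the tree builder failed in the first place.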