Getting an out-of-bounds error with h2o.kmeans

Asked: 2017-07-06 05:08:22

Tags: r out-of-memory k-means h2o

I am trying to run k-means with the h2o package. Here is the information about my H2O cluster:

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         5 seconds 382 milliseconds 
    H2O cluster version:        3.10.5.3 
    H2O cluster version age:    5 days  
    H2O cluster name:           H2O_started_from_R_rgb505 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   14.22 GB 
    H2O cluster total cores:    4 
    H2O cluster allowed cores:  4 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    R Version:                  R version 3.4.1 (2017-06-30) 

My data is [32000, 14], so it is very small. When I try to run h2o.kmeans, I get the error shown below. This is the call:

h2o_kmeans <- h2o.kmeans(training_frame = spmx_train.h2o, 
                         nfolds = 10,
                         k = 20,
                         estimate_k = TRUE,
                         max_iterations = 10,
                         standardize = FALSE
)

Error:

java.lang.ArrayIndexOutOfBoundsException: 6

java.lang.ArrayIndexOutOfBoundsException: 6
  at water.util.ArrayUtils.add(ArrayUtils.java:163)
  at hex.ModelMetricsClustering$MetricBuilderClustering.reduce(ModelMetricsClustering.java:131)
  at hex.ModelMetricsClustering$MetricBuilderClustering.reduce(ModelMetricsClustering.java:80)
  at hex.ModelBuilder.cv_mainModelScores(ModelBuilder.java:512)
  at hex.ModelBuilder.computeCrossValidation(ModelBuilder.java:292)
  at hex.ModelBuilder$1.compute2(ModelBuilder.java:207)
  at water.H2O$H2OCountedCompleter.compute(H2O.java:1256)
  at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
  at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
  at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
  at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
  at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Error: java.lang.ArrayIndexOutOfBoundsException: 6

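When I change nfolds to 5, it runs fine, so I suspect some kind of memory issue. It is hard to believe that h2o cannot handle a 10-fold k-means on such a small dataset. Occasionally, at random, the code does run. I closed all other applications and only R is running. Is there anything I can do to improve this?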
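One thing that may be worth trying is sketched below. It rests on an unconfirmed reading of the stack trace: the exception is thrown in `ArrayUtils.add` while the cross-validation metrics are being reduced, which would be consistent with different folds ending up with different numbers of clusters when `estimate_k = TRUE`. That is a guess, not a verified diagnosis. The stand-in data frame and the `seed` value are assumptions made for illustration; swap in the real `spmx_train.h2o`.

```r
library(h2o)
h2o.init()

# Hypothetical stand-in with the same shape as in the question (32000 x 14).
# Replace this with the real spmx_train.h2o frame.
set.seed(1)
spmx_train.h2o <- as.h2o(as.data.frame(matrix(rnorm(32000 * 14), ncol = 14)))

# Variant A: let H2O estimate k (up to 20), but without cross-validation.
km_estimate <- h2o.kmeans(training_frame = spmx_train.h2o,
                          k = 20,
                          estimate_k = TRUE,
                          max_iterations = 10,
                          standardize = FALSE,
                          seed = 1234)

# Number of clusters H2O settled on in Variant A.
nrow(h2o.centers(km_estimate))

# Variant B: keep 10-fold cross-validation, but with a fixed k,
# so every fold model has the same number of clusters.
km_cv <- h2o.kmeans(training_frame = spmx_train.h2o,
                    k = 20,
                    estimate_k = FALSE,
                    nfolds = 10,
                    max_iterations = 10,
                    standardize = FALSE,
                    seed = 1234)
```

If both variants run cleanly while the combined call still fails, that would point at the interaction between `estimate_k` and `nfolds` rather than at memory.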

0 answers:

No answers yet