Always getting java.lang.ArrayIndexOutOfBoundsException when trying to run AutoML from Java in H2O

Time: 2018-08-30 19:01:24

Tags: java h2o automl

I am using H2O 3.20.0.5 to train some models.

I want to build models by using AutoML from Java code. So far I can import and parse the CSV file. However, when I call AutoML.startAutoML().get(), I always get a java.lang.ArrayIndexOutOfBoundsException in ModelBuilder.java.

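It seems that no algorithms are defined in ModelBuilder. But I am using AutoML, which should create the algorithms automatically, so I am confused. Here is my code: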

private static void trainByAutoML(Frame frame, String projectName) throws FileNotFoundException {

    AutoMLBuildSpec autoMLBuildSpec = new AutoMLBuildSpec();

    autoMLBuildSpec.input_spec.training_frame = frame._key;
    autoMLBuildSpec.input_spec.response_column = "level";
    autoMLBuildSpec.input_spec.sort_metric = "AUTO";

    autoMLBuildSpec.build_control.balance_classes = true;
    autoMLBuildSpec.build_control.class_sampling_factors = new float[2];
    autoMLBuildSpec.build_control.class_sampling_factors[0] = 1.0f;
    autoMLBuildSpec.build_control.class_sampling_factors[1] = 1.0f;
    autoMLBuildSpec.build_control.nfolds = nfolds;
    autoMLBuildSpec.build_control.keep_cross_validation_models = false;
    autoMLBuildSpec.build_control.keep_cross_validation_predictions = true;
    autoMLBuildSpec.build_control.project_name = projectName;
    HyperSpaceSearchCriteria.RandomDiscreteValueSearchCriteria randomDiscreteValueSearchCriteria = new HyperSpaceSearchCriteria.RandomDiscreteValueSearchCriteria();
    randomDiscreteValueSearchCriteria.set_max_runtime_secs(Double.parseDouble(autoModelRuntimeSeconds));
    randomDiscreteValueSearchCriteria.set_stopping_metric(ScoreKeeper.StoppingMetric.AUTO);
    randomDiscreteValueSearchCriteria.set_stopping_tolerance(0.0);
    autoMLBuildSpec.build_control.stopping_criteria = randomDiscreteValueSearchCriteria;

    AutoMLBuildSpec.AutoMLBuildModels autoMLBuildModels = new AutoMLBuildSpec.AutoMLBuildModels();
    autoMLBuildModels.exclude_algos = new AutoML.algo[4];
    autoMLBuildModels.exclude_algos[0] = AutoML.algo.DRF;
    autoMLBuildModels.exclude_algos[1] = AutoML.algo.GBM;
    autoMLBuildModels.exclude_algos[2] = AutoML.algo.GLM;
    autoMLBuildModels.exclude_algos[3] = AutoML.algo.StackedEnsemble;

    autoMLBuildSpec.build_models = autoMLBuildModels;

    AutoML aml = AutoML.makeAutoML(Key.make(), new Date(), autoMLBuildSpec);
    AutoML.startAutoML(aml);
    AutoML.startAutoML(autoMLBuildSpec).get();
    aml.leader().getMojo().writeTo(new FileOutputStream(new File("/home/aaa/model_data/1.zip")));
}

When the code reaches AutoML.startAutoML(autoMLBuildSpec).get();, I always get this exception. Here is the stack trace:

08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR: java.lang.ArrayIndexOutOfBoundsException: -1
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at hex.ModelBuilder.algoName(ModelBuilder.java:106)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at ai.h2o.automl.AutoML.trainModel(AutoML.java:519)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at ai.h2o.automl.AutoML.defaultDeepLearning(AutoML.java:808)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at ai.h2o.automl.AutoML.learn(AutoML.java:1058)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at ai.h2o.automl.AutoML.run(AutoML.java:369)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at ai.h2o.automl.H2OJob$1.compute2(H2OJob.java:32)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at water.H2O$H2OCountedCompleter.compute(H2O.java:1267)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
08-31 02:38:53.583 127.0.0.1:54321       9246   FJ-1-15   ERRR:     at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

1 answer:

Answer 0 (score: 0)

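Could you share details on how you start H2O? It looks like you skipped some steps of the startup sequence, because the algorithms have not been registered.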

The H2O log would be very useful in this case: it can show us what is missing.
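
To illustrate why unregistered algorithms could surface as an index of -1, here is a minimal, self-contained sketch. It is not H2O's actual code: the class AlgoRegistrySketch and its methods are hypothetical stand-ins, and it only assumes that the name lookup returns -1 for an unknown algorithm and that the result is then used as an array index without a check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a registry that is normally filled during startup.
// This is an illustration only, not the real hex.ModelBuilder implementation.
public class AlgoRegistrySketch {
    private final List<String> names = new ArrayList<>();

    // Called once per algorithm during a (hypothetical) startup sequence.
    public void register(String name) {
        names.add(name);
    }

    // Returns the position of the algorithm, or -1 if it was never registered.
    public int indexOf(String name) {
        return names.indexOf(name);
    }

    // Unchecked lookup: uses the index directly, so an unregistered
    // algorithm leads to an array access at index -1.
    public String algoName(String name) {
        String[] table = names.toArray(new String[0]);
        return table[indexOf(name)];
    }

    public static void main(String[] args) {
        AlgoRegistrySketch registry = new AlgoRegistrySketch();

        // Lookup before registration: the startup step was skipped.
        try {
            registry.algoName("DeepLearning");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Lookup before registration failed: " + e.getMessage());
        }

        // Lookup after registration succeeds.
        registry.register("DeepLearning");
        System.out.println("Lookup after registration: " + registry.algoName("DeepLearning"));
    }
}
```

If the real cause is the same, the stack trace above would fit: the first model AutoML tries to train (defaultDeepLearning) fails in algoName before any training starts, so the fix would be in how the H2O node is started rather than in the AutoMLBuildSpec settings.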