Weka Java: load a model and use it with a test dataset

Time: 2013-05-29 11:49:46

Tags: java data-mining weka

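I am trying to build a Weka model by following the serialization and deserialization instructions in the Weka wiki. The model is built with BayesNet on the training set, and I want to load that model for testing. The training and test sets have the same attributes. The filter is set up as follows: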

    Remove rm = generateFilter(filterOption);

    // Use one variable name throughout; the original mixed `fc` and `filterClassifier`
    FilteredClassifier filterClassifier = new FilteredClassifier();
    filterClassifier.setFilter(rm);
    filterClassifier.setClassifier(randomTree);
    filterClassifier.buildClassifier(data);
    exportClassifier("randomTree", file, filterClassifier);

The export code looks like this:

    private void exportClassifier(String method, String file,
            FilteredClassifier filterClassifier) throws IOException,
            FileNotFoundException {
        System.out.println(file + "." + method + ".model");

        ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(
                file + "." + method + ".model"));
        oos.writeObject(filterClassifier);
        oos.flush();
        oos.close();
    }

But when I try to load the model and run it against a different test set:

    public String EvaluateModel(String file, File modelFile) throws Exception {
        Instances data = populateInstance(file);

        if (data.classIndex() == -1) {
            System.out.println("reset index...");
            data.setClassIndex(data.numAttributes() - 1);
        }

        FilteredClassifier classifier = (FilteredClassifier) weka.core.SerializationHelper
                .read(new FileInputStream(modelFile));

        //classifier.buildClassifier(data);
        Evaluation eval = new Evaluation(data);
        //eval.crossValidateModel(classifier, data, 10, new Random(1));
        eval.evaluateModel(classifier, data);

        String summaryString = eval
                .toSummaryString("\nResults\n======\n", false);

        System.out.println(summaryString);
        System.out.println(eval.fMeasure(1) + " " + eval.precision(1) + " "
                + eval.recall(1));
        return formatOutput(eval);
    }

I get this exception:

    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1200
        at weka.classifiers.bayes.net.estimate.DiscreteEstimatorBayes.getProbability(DiscreteEstimatorBayes.java:106)
        at weka.classifiers.bayes.net.estimate.SimpleEstimator.distributionForInstance(SimpleEstimator.java:183)
        at weka.classifiers.bayes.BayesNet.distributionForInstance(BayesNet.java:386)
        at weka.classifiers.meta.FilteredClassifier.distributionForInstance(FilteredClassifier.java:437)
        at weka.classifiers.Evaluation.evaluateModelOnceAndRecordPrediction(Evaluation.java:1439)
        at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:1407)
        at com.besmart.raynor.dataprocessing.dataprocessor.weka.WekaRunner.EvaluateModel(WekaRunner.java:138)
        at com.besmart.raynor.dataprocessing.dataprocessor.weka.WekaBatchRunner.batchReEvaluation(WekaBatchRunner.java:80)
        at com.besmart.raynor.dataprocessing.dataprocessor.weka.WekaBatchRunner.main(WekaBatchRunner.java:103)

1 answer:

Answer 0 (score: 1)

Instead of writing the object with ObjectOutputStream, you can use the weka.core.SerializationHelper.write method.
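
Whichever helper writes the file, it can be worth checking that the saved file round-trips cleanly before evaluating anything. The sketch below shows the save/load pattern using only JDK classes, so it runs without Weka on the classpath. `ModelRoundTrip` and `DummyModel` are hypothetical names, with `DummyModel` standing in for the trained classifier. The streams are opened in try-with-resources so they are closed even if writing or reading fails.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

public class ModelRoundTrip {

    /** Hypothetical stand-in for a trained classifier; this is not a Weka class. */
    public static class DummyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public final double[] weights;

        public DummyModel(String name, double[] weights) {
            this.name = name;
            this.weights = weights;
        }
    }

    /** Serializes the object to the given path; the stream is closed even on failure. */
    public static void save(Path path, Serializable model) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(path))) {
            oos.writeObject(model);
        }
    }

    /** Reads back whatever object was serialized at the given path. */
    public static Object load(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(path))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("dummy", ".model");
        try {
            save(path, new DummyModel("randomTree", new double[] {0.25, 0.75}));
            DummyModel restored = (DummyModel) load(path);
            System.out.println(restored.name + " " + restored.weights.length);
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

With Weka available, the two helper methods would correspond to `weka.core.SerializationHelper.write(String, Object)` and `weka.core.SerializationHelper.read(String)`, and the result of `read` would be cast back to `FilteredClassifier` as in the question.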