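Using Weka J48 in Java code

Time: 2015-03-11 10:23:47

Tags: java weka

I want to use Weka from Java code. I have successfully built a J48 model and saved it to disk for testing. However, the output of the classifier's classifyInstance always sticks to a single value. Any suggestions as to why this happens?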

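The build-model function also works fine: it produces the evaluation summary that I have included at the bottom.

    // minNumObj, confidenceFactor, numFolds, unPruned, fc, train and attribMap
    // are fields of the enclosing class (not shown).
    public J48 buildModel() throws Exception {
        J48 j48 = new J48();
        j48.setMinNumObj(minNumObj);
        j48.setConfidenceFactor(confidenceFactor);
        j48.setNumFolds(numFolds);
        j48.setUnpruned(unPruned);

        // fc wraps the J48; training goes through the wrapper.
        fc.setClassifier(j48);
        fc.buildClassifier(train);

        // Evaluate on the same data the model was trained on.
        Evaluation evaluation = new Evaluation(train);
        evaluation.evaluateModel(fc, train);

        String summary = evaluation.toSummaryString();
        System.out.println(summary);
        attribMap.modelResult = summary;

        // Note: the inner J48 is returned, not the wrapper fc.
        return j48;
    }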

Saving the model also works fine; I am including it only for reference.

    public void saveModel(J48 j48) throws Exception {
        // Resulting file: <path>\<fileName>tem.model
        String fileName = attribMap.path + "\\" + attribMap.fileName;
        System.out.println("j48 path:" + fileName);
        fileName = fileName + "tem.model";
        weka.core.SerializationHelper.write(fileName, j48);
    }
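The save function assembles the file name by string concatenation with a hard-coded `\\`, so the saved file ends up as `<path>\<fileName>tem.model`. A minimal sketch, assuming the directory and base name come from the `attribMap` fields above, of building that location in one place with `java.nio.file` so that saving and loading can share it. `ModelPaths` and `modelFile` are hypothetical names for illustration, not part of the question's code or of Weka.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds the model file location in one place. */
class ModelPaths {

    /** Suffix taken from the question's saveModel ("tem.model"). */
    static final String MODEL_SUFFIX = "tem.model";

    /**
     * Joins a directory and a base file name with the platform separator and
     * appends the suffix, mirroring path + "\\" + fileName + "tem.model".
     */
    static Path modelFile(String directory, String baseName) {
        return Paths.get(directory).resolve(baseName + MODEL_SUFFIX);
    }

    public static void main(String[] args) {
        // "autoGeneratedFiles" and "run1" are made-up example values.
        Path p = modelFile("autoGeneratedFiles", "run1");
        System.out.println("model path: " + p);
    }
}
```

The resulting `p.toString()` could then be passed both to `weka.core.SerializationHelper.write` when saving and to `weka.core.SerializationHelper.read` when loading, so the two sides cannot drift apart.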

This is the test function, to which the test ARFF file (as a reader) and the model file path are passed.

    public String testModel(BufferedReader reader1, String modelPath, String classAtrib) {
        double pred = 0.0;
        try {
            String result = "";
            test = new Instances(reader1);
            // The class attribute is the last one in the test file.
            test.setClassIndex(test.numAttributes() - 1);
            System.out.println("numattrib:" + test.numAttributes());
            System.out.println("model path:" + modelPath);
            // The model is read from this hard-coded path; modelPath is only printed.
            J48 cls = (J48) weka.core.SerializationHelper.read("H:\\C-DAC\\autoGeneratedFiles\\tem.model");
            System.out.println(cls.toSummaryString());
            System.out.println("test num instances:" + test.numInstances());
            System.out.println("b4 pred: " + test.instance(0));
            Evaluation eTest = new Evaluation(test);
            double[] d = eTest.evaluateModel(cls, test);
            for (int i = 0; i < d.length; i++) {
                System.out.println("val " + d[i]);
            }

            for (int i = 0; i < test.numInstances(); i++) {
                System.out.println(", predicted: " + "pred value" + pred + "::::" + test.classAttribute().value((int) pred));
            }
            System.out.println("af pred:" + pred);
            result = test.classAttribute().value((int) pred);
            attribMap.modelTestResult = "Predicted value for " + classAtrib + " is " + result + " for the given inputs of ";
            return result;

        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("PREDDD : " + pred + " " + test.numInstances());
        }
        return null;
    }

Output is produced here, but the pred variable always stays stuck at a single value.
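Going only by the code shown, `pred` is initialised to 0.0 and never assigned afterwards, so the loop prints `test.classAttribute().value(0)` on every pass whatever the input is. A minimal sketch of a loop shape that recomputes it per instance follows. To keep it runnable without Weka on the classpath, `InstanceClassifier` is a hypothetical stand-in for the loaded J48, the label list copies the `category_name` values from the question's ARFF header, and the toy rule in `main` is invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a trained classifier; not a Weka type. */
interface InstanceClassifier {
    /** Returns the index of the predicted class label, as a double. */
    double classifyInstance(double[] features);
}

class PredictionLoop {

    /** Classifies every row and maps each predicted index to its label. */
    static List<String> predictAll(InstanceClassifier cls, double[][] rows,
                                   List<String> classLabels) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            // The assignment that is absent in the question's loop: pred is
            // recomputed for each instance instead of staying at 0.0.
            double pred = cls.classifyInstance(rows[i]);
            results.add(classLabels.get((int) pred));
        }
        return results;
    }

    public static void main(String[] args) {
        // Labels in the order declared for category_name.
        List<String> labels = Arrays.asList("OBC", "GEN", "SC", "ST");
        // Invented toy rule (not a J48 tree): first feature 1 -> index 0, else index 1.
        InstanceClassifier toy = f -> f[0] == 1.0 ? 0.0 : 1.0;
        double[][] rows = { {1, 11, 2, 0.01}, {2, 11, 35, 0.01} };
        System.out.println(predictAll(toy, rows, labels));
    }
}
```

With Weka itself, the corresponding line inside the loop would be `pred = cls.classifyInstance(test.instance(i));`, followed by the existing `test.classAttribute().value((int) pred)` lookup.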

This is the ARFF file I am trying to test with. Training data:

            @relation j48

            @attribute category_name {OBC,GEN,SC,ST}
            @attribute gender numeric
            @attribute paper numeric
            @attribute state_permanent numeric
            @attribute age numeric

            @data
            OBC,1,11,2,0.01
            GEN,2,11,35,0.01
            OBC,2,11,21,0.01
            OBC,2,5,32,0.01
            OBC,2,17,16,0.01
            GEN,2,5,34,0.01
            SC,2,8,21,0.01
            GEN,1,11,20,0.01
            OBC,1,5,12,0.01
            OBC,1,10,18,0.01
            GEN,2,10,2,0.01
            GEN,1,14,16,0.01
            OBC,2,17,17,0.01
            OBC,2,17,21,0.01
            OBC,2,17,32,0.01
            SC,1,8,32,0.01

This summary is for the training data:

        evaluation summary 
        Correctly Classified Instances         362               53.4712 %
        Incorrectly Classified Instances       315               46.5288 %
        Kappa statistic                          0.1393
        Mean absolute error                      0.2959
        Root mean squared error                  0.3846
        Relative absolute error                 93.3196 %
        Root relative squared error             96.654  %
        Coverage of cases (0.95 level)         100      %
        Mean rel. region size (0.95 level)      92.9838 %
        Total Number of Instances              677

This is the test data given to the model. I use test.numAttributes() - 1 as the class index because the class attribute is my last attribute.

            @relation j48

            @attribute gender numeric
            @attribute paper numeric
            @attribute state_permanent numeric
            @attribute age numeric
            @attribute category_name {OBC,GEN,SC,ST}

            @data
            1,11,2,0.01,?

This is the test data, where ? marks the value the model has to predict. The problem is that the model predicts the same value for any given input.
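The training header declares `category_name` first, while the test header declares it last. Whether that matters depends on how the training set was prepared before `buildClassifier` (that part is not shown), but it is cheap to check that both files declare the same attributes in the same order. A rough sketch using only the standard library; `ArffHeaderCheck` and its methods are hypothetical helpers, and the parsing only handles simple unquoted `@attribute name type` lines like the ones in this question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: compares attribute order in two ARFF headers. */
class ArffHeaderCheck {

    /** Collects attribute names, in declaration order, up to the @data line. */
    static List<String> attributeNames(String arffText) {
        List<String> names = new ArrayList<>();
        for (String raw : arffText.split("\\R")) {
            String line = raw.trim();
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("@data")) {
                break; // the header ends here
            }
            if (lower.startsWith("@attribute")) {
                // "@attribute <name> <type>": the name is the second token.
                String[] parts = line.split("\\s+", 3);
                if (parts.length >= 2) {
                    names.add(parts[1]);
                }
            }
        }
        return names;
    }

    /** True when both headers list the same attribute names in the same order. */
    static boolean sameOrder(String trainArff, String testArff) {
        return attributeNames(trainArff).equals(attributeNames(testArff));
    }

    public static void main(String[] args) {
        // Headers copied from the question's training and test files.
        String train = String.join("\n",
            "@relation j48",
            "@attribute category_name {OBC,GEN,SC,ST}",
            "@attribute gender numeric",
            "@attribute paper numeric",
            "@attribute state_permanent numeric",
            "@attribute age numeric",
            "@data");
        String test = String.join("\n",
            "@relation j48",
            "@attribute gender numeric",
            "@attribute paper numeric",
            "@attribute state_permanent numeric",
            "@attribute age numeric",
            "@attribute category_name {OBC,GEN,SC,ST}",
            "@data");
        System.out.println("train: " + attributeNames(train));
        System.out.println("test:  " + attributeNames(test));
        System.out.println("same order: " + sameOrder(train, test));
    }
}
```

Within Weka, `Instances.equalHeaders(Instances)` offers a comparable check on datasets that are already loaded.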

0 answers:

No answers yet