Weka: How to use cross-validation in code

Time: 2016-03-12 02:49:57

Tags: classification weka evaluation cross-validation

I am trying to use cross-validation in the following code:

#! /bin/sh
# main script
# code here
. ./sub_script  # run commands from sub_script
# more code here

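I have some cross-validation code, but I cannot integrate it with the code above. Please suggest how to integrate the following code into the code above to perform cross-validation.

Code: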

   TextDirectoryToArff d = new TextDirectoryToArff();

      try {
    Instances dataset = d.createDataset("C:\\mytest");
    dataset.setClassIndex(dataset.numAttributes() - 1 );

    double precision = 0, recall=0,fmeasure=0,error=0;

    // number of instances in each of the 10 folds
    int size1 = dataset.numInstances() / 10;

    // the test fold covers instances begin..end (inclusive)
    int begin = 0;
    int end = size1 - 1 ;

    for (int i=1 ; i<=10;i++)
    {
        System.out.println("iteration :" + i);

        Instances training = new Instances(dataset);
        Instances testing = new Instances(dataset, begin , (end - begin + 1));

        // remove the test fold from the training copy
        for (int j=0;j < (end - begin + 1); j++)
            training.delete(begin);

        Classifier tree = new NaiveBayes();

        StringToNominal nominal ;

        // convert string attributes (other than the class) to nominal
        for(int a=0;a<training.numAttributes()-1;a++)
        {
            if(training.attribute(a).isString())
            {
                nominal = new StringToNominal();
                nominal.setAttributeRange(String.valueOf(a + 1)); // ranges are 1-based
                nominal.setInputFormat(training);
                training = Filter.useFilter(training, nominal);
            }
        }

        tree.buildClassifier(training);

        Evaluation eval = new Evaluation(testing);

        eval.evaluateModel(tree, testing);
        System.out.println("Precision:" + eval.precision(1));
        System.out.println("Recall:" + eval.recall(1));
        System.out.println("Fmeasure:" + eval.fMeasure(1));
        System.out.println("Error:" + eval.errorRate());

        // move on to the next fold
        begin = end + 1;
        end = begin + size1 - 1;
    }
      } catch (Exception e) {
    e.printStackTrace();
      }

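1 Answer:

Answer 0: (score: 0)

I think you are confused about Weka's Evaluation.crossValidateModel method. When called with 10 folds, it already creates 10 different train/test splits, trains 10 models on the training folds, and evaluates each model on its test fold, so there is no need to compute this yourself the way your code tries to.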

So in your code:

    TextDirectoryToArff d = new TextDirectoryToArff();

    try {
        Instances dataset = d.createDataset("C:\\mytest");
        dataset.setClassIndex(dataset.numAttributes() - 1);

        // you already have the dataset

        Classifier naiveBayes = new NaiveBayes();
        // you need a classifier
        Evaluation eval = new Evaluation(dataset);
        // only call crossValidateModel with your classifier, on your dataset, with 10 folds, and a random number generator
        eval.crossValidateModel(naiveBayes, dataset, 10, new Random(1));
        // print the results of the 10 folds
        // (crossValidateModel trains copies of the classifier, so naiveBayes itself is not built here)
        System.out.println(naiveBayes);
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toMatrixString());
        System.out.println(eval.toClassDetailsString());
    } catch (Exception e) {
        e.printStackTrace();
    }

You can find more information at https://weka.wikispaces.com/Generating+cross-validation+folds+(Java+approach)
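To make the "10 different train and test folds" idea concrete, here is a minimal sketch of how n instances might be partitioned into k folds. It uses only the Java standard library and plain integer indices instead of Weka's Instances. The class name KFoldSketch, the helper testRange, and the sizes n = 25 and k = 10 are made up for illustration, and this is not presented as Weka's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative k-fold index partitioning (hypothetical helper, not Weka code). */
class KFoldSketch {

    /** Returns {start, endExclusive} of the test fold `fold` when n instances are split into k folds. */
    public static int[] testRange(int n, int k, int fold) {
        int base = n / k;   // minimum number of instances per fold
        int extra = n % k;  // the first `extra` folds receive one additional instance
        int start = fold * base + Math.min(fold, extra);
        int size = base + (fold < extra ? 1 : 0);
        return new int[] { start, start + size };
    }

    public static void main(String[] args) {
        int n = 25; // assumed dataset size
        int k = 10; // number of folds

        // Shuffle the instance indices once, with a fixed seed for repeatability
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(1));

        for (int fold = 0; fold < k; fold++) {
            int[] r = testRange(n, k, fold);

            // The test fold is one contiguous slice; training is everything else
            List<Integer> test = indices.subList(r[0], r[1]);
            List<Integer> train = new ArrayList<>(indices.subList(0, r[0]));
            train.addAll(indices.subList(r[1], n));

            System.out.println("fold " + fold + ": train=" + train.size() + " test=" + test.size());
        }
    }
}
```

Each index lands in exactly one test fold, so every instance is used for testing once and for training in the remaining folds. The hand-written loop in the question needs to handle the remainder when n is not divisible by k; Evaluation.crossValidateModel takes care of the splitting for you.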