I want to use a Weka J48 tree with 5-fold cross-validation. This is my code so far:
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try
        {
            CSV2Arff converter = new CSV2Arff();
            converter.convert();
            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();
            train.setClassIndex(train.numAttributes() - 1); // setting class attribute
            // classifier
            J48 j48 = new J48();
            j48.setUnpruned(true); // using an unpruned J48
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
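This code trains on the data and prints the J48 tree, but I can't find out how to set the number of folds for cross-validation. Please explain in detail; I'm not good at Java.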
Answer 0 (score: 1)
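Here is your code with a 5-fold cross-validation evaluation of your J48 classifier added. The number of folds is the third argument to crossValidateModel. It is important to run the evaluation before training the final classifier. Additional information can be found here.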
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try
        {
            CSV2Arff converter = new CSV2Arff();
            converter.convert();
            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();
            train.setClassIndex(train.numAttributes() - 1); // setting class attribute
            // classifier
            J48 j48 = new J48();
            j48.setUnpruned(true); // using an unpruned J48
            // evaluate j48 with cross-validation
            Evaluation eval = new Evaluation(train);
            // arguments, in order: the classifier, the training data,
            // the number of folds, and a seeded random number generator
            eval.crossValidateModel(j48, train, 5, new Random(1));
            System.out.println("Percent correct: " + eval.pctCorrect());
            // train the final classifier on all the data and print the tree
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
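Since you asked for detail, here is a rough sketch in plain Java of what "5 folds" means. It is not Weka's implementation (Weka also stratifies the folds when the class is nominal); the class name FoldSketch and the method makeFolds are made up for illustration. The rows are shuffled with a seeded Random and dealt into k groups. In each round one group is held out for testing and the other k - 1 groups are used for training, so every row is tested exactly once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Weka.
class FoldSketch {

    // Shuffles the row indices 0..n-1 and deals them into k folds of near-equal size.
    static List<List<Integer>> makeFolds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // The seed plays the same role as new Random(1) in the answer: it makes the split repeatable.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int pos = 0; pos < n; pos++) {
            folds.get(pos % k).add(indices.get(pos));
        }
        return folds;
    }

    public static void main(String[] args) {
        int n = 12; // number of rows in the data set (made-up value)
        int k = 5;  // number of folds
        List<List<Integer>> folds = makeFolds(n, k, 1L);
        for (int f = 0; f < k; f++) {
            int testSize = folds.get(f).size();
            // In round f, fold f is the test set and all other folds form the training set.
            System.out.println("Round " + (f + 1) + ": test on " + testSize
                    + " rows, train on " + (n - testSize) + " rows");
        }
    }
}
```

In the Weka code above you do not write any of this yourself: passing 5 as the third argument to crossValidateModel asks Weka to do the splitting, training and testing for you.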