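I have a large dataset of 10,000 records: 5,000 belong to class 1 and the remaining 5,000 to class -1. I trained a random forest on it and got good accuracy, above 90%.
Now, suppose I have an ARFF file: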
@relation cds_orf
@attribute start numeric
@attribute end numeric
@attribute score numeric
@attribute orf_coverage numeric
@attribute class {1,-1}
@data
(suppose this contains 5 records)
My output should look like this:
No Actual_class Predicted class
1 1 1
2 1 1
3 -1 -1
4 1 -1
5 1 1
I would like Java code that prints this output. Thanks. (Note: I tried classifier.classifyInstance(), but it throws a NullPointerException.)
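As a rough, Weka-independent sketch of the requested layout: the snippet below assumes predictions arrive as class indices (which is how classifyInstance() reports a nominal class) and that the labels follow the order declared in `@attribute class {1,-1}`. The names `PredictionTable`, `labelFor` and `format` are made up for illustration, and the sample values simply mirror the five example rows above.

```java
class PredictionTable {
    // Labels in the order declared by "@attribute class {1,-1}":
    // index 0 -> "1", index 1 -> "-1".
    static final String[] CLASS_LABELS = {"1", "-1"};

    // Converts a predicted class index such as 0.0 or 1.0 into its nominal label.
    static String labelFor(double predictionIndex) {
        return CLASS_LABELS[(int) predictionIndex];
    }

    // Builds a tab-separated table: record number, actual label, predicted label.
    static String format(String[] actual, double[] predictedIndex) {
        StringBuilder sb = new StringBuilder("No\tActual_class\tPredicted_class\n");
        for (int i = 0; i < actual.length; i++) {
            sb.append(i + 1).append('\t')
              .append(actual[i]).append('\t')
              .append(labelFor(predictedIndex[i])).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values mirroring the five example rows in the question.
        String[] actual = {"1", "1", "-1", "1", "1"};
        double[] predicted = {0.0, 0.0, 1.0, 1.0, 0.0};
        System.out.print(format(actual, predicted));
    }
}
```

The point of the sketch is the index-to-label step: a prediction of 1.0 means "the second declared label", which here is -1, not the label 1.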
Answer 0: (score: 3)
After a lot of research, I found the answer myself. The following code does exactly this and writes the output to another file, orf_out.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;

/**
 *
 * @author samy
 */
public class WekaTest {

    /**
     * @throws java.lang.Exception
     */
    public static void rfnew() throws Exception {
        int numFolds = 10;

        // Load the ARFF data; the last attribute is the class.
        BufferedReader br = new BufferedReader(new FileReader("orf_arff"));
        Instances trainData = new Instances(br);
        trainData.setClassIndex(trainData.numAttributes() - 1);
        br.close();

        RandomForest rf = new RandomForest();
        rf.setNumTrees(100); // newer Weka releases name this setter setNumIterations

        // Cross-validation trains copies of rf, so rf itself is still unbuilt afterwards.
        Evaluation evaluation = new Evaluation(trainData);
        evaluation.crossValidateModel(rf, trainData, numFolds, new Random(1));

        // Build the model before calling classifyInstance().
        rf.buildClassifier(trainData);

        PrintWriter out = new PrintWriter("orf_out");
        out.println("No.\tTrue\tPredicted");
        for (int i = 0; i < trainData.numInstances(); i++) {
            String trueClassLabel = trainData.instance(i).toString(trainData.classIndex());

            // Discrete prediction: the index of the predicted class value.
            double predictionIndex = rf.classifyInstance(trainData.instance(i));

            // Get the predicted class label from the predictionIndex.
            String predictedClassLabel = trainData.classAttribute().value((int) predictionIndex);

            out.println((i + 1) + "\t" + trueClassLabel + "\t" + predictedClassLabel);
        }

        out.println(evaluation.toSummaryString("\nResults\n======\n", true));
        out.println(evaluation.toClassDetailsString());

        // Class index 0 is the first declared label ("1"), index 1 the second ("-1").
        out.println("Results for class " + trainData.classAttribute().value(0));
        out.println("Precision= " + evaluation.precision(0));
        out.println("Recall= " + evaluation.recall(0));
        out.println("F-measure= " + evaluation.fMeasure(0));
        out.println("Results for class " + trainData.classAttribute().value(1));
        out.println("Precision= " + evaluation.precision(1));
        out.println("Recall= " + evaluation.recall(1));
        out.println("F-measure= " + evaluation.fMeasure(1));
        out.close();
    }
}
The missing piece was calling buildClassifier in my code before classifyInstance().
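To make the per-class precision, recall and F-measure figures printed by the code above easier to interpret, here is a small plain-Java sketch that computes the same kind of quantities by hand for the five example rows from the question. It does not use Weka; the class name `PerClassMetrics` and its methods are hypothetical, and returning 0 when a denominator is zero is a choice made for this sketch only.

```java
class PerClassMetrics {
    // Precision: of the records predicted as `label`, the fraction that truly are `label`.
    static double precision(String[] actual, String[] predicted, String label) {
        int tp = 0, fp = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i].equals(label)) {
                if (actual[i].equals(label)) tp++; else fp++;
            }
        }
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    // Recall: of the records that truly are `label`, the fraction predicted as `label`.
    static double recall(String[] actual, String[] predicted, String label) {
        int tp = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i].equals(label)) {
                if (predicted[i].equals(label)) tp++; else fn++;
            }
        }
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    // F-measure: harmonic mean of precision and recall.
    static double fMeasure(String[] actual, String[] predicted, String label) {
        double p = precision(actual, predicted, label);
        double r = recall(actual, predicted, label);
        return p + r == 0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // The five example rows from the question (actual vs. predicted class).
        String[] actual = {"1", "1", "-1", "1", "1"};
        String[] predicted = {"1", "1", "-1", "-1", "1"};
        for (String label : new String[] {"1", "-1"}) {
            System.out.println("Results for class " + label);
            System.out.println("Precision= " + precision(actual, predicted, label));
            System.out.println("Recall= " + recall(actual, predicted, label));
            System.out.println("F-measure= " + fMeasure(actual, predicted, label));
        }
    }
}
```

For those five rows, class 1 has precision 1.0 and recall 0.75 (record 4 is missed), while class -1 has precision 0.5 and recall 1.0 (record 4 is wrongly assigned to it).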