How to print the predicted class after cross-validation in WEKA

Asked: 2011-09-06 03:40:25

Tags: java validation machine-learning weka decision-tree

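After running a 10-fold cross-validation with a classifier, how can I print the predicted class of every instance, together with the class distribution for each of those instances?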

J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
eval.crossValidateModel(j48, newData, 10, new Random(1));

When I try something like the following, it says the classifier has not been built:

for (int i = 0; i < newData.numInstances(); i++) {
    System.out.println(j48.distributionForInstance(newData.instance(i)));
}

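What I am after is the same functionality as in the WEKA GUI: once a classifier has been trained, I can click "Visualize classifier errors" > Save and find the predicted class in the saved file. Now I need to do the same from my own Java code.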


I have tried something like the following:

J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
StringBuffer forPredictionsPrinting = new StringBuffer();
weka.core.Range attsToOutput = null;
Boolean outputDistribution = new Boolean(true);
eval.crossValidateModel(j48, newData, 10, new Random(1), forPredictionsPrinting, attsToOutput, outputDistribution);

However, it gives me this error:

Exception in thread "main" java.lang.ClassCastException: java.lang.StringBuffer cannot be cast to weka.classifiers.evaluation.output.prediction.AbstractOutput

3 answers:

Answer 0 (score: 3)

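The crossValidateModel() method accepts a varargs parameter, forPredictionsPrinting, which expects a weka.classifiers.evaluation.output.prediction.AbstractOutput instance.

The important part is the StringBuffer that holds the string representation of all the predictions. The following code is untested JRuby, but you should be able to convert it to suit your needs.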

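java_import 'weka.classifiers.Evaluation', 'weka.classifiers.trees.J48', 'weka.core.Range'
java_import 'weka.classifiers.evaluation.output.prediction.PlainText'
j48 = J48.new
evaluation = Evaluation.new(newData)
predictions = java.lang.StringBuffer.new
output = PlainText.new  # an AbstractOutput subclass that writes into the buffer
output.setBuffer(predictions)
evaluation.crossValidateModel(j48, newData, 10, java.util.Random.new(1), output, Range.new('1'), true)
# predictions now holds a string of all the individual predictions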
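
The ClassCastException in the question is what one would expect from an Object... (varargs) parameter: the compiler accepts any object, and the type is only checked when the method casts the argument at run time. Below is a minimal, WEKA-free sketch of that mechanism. PredictionOutput, BufferOutput and evaluate are hypothetical stand-ins for AbstractOutput, PlainText and crossValidateModel; they are not WEKA classes and only mimic the cast.

```java
public class VarargsCastSketch {

    /** Hypothetical stand-in for WEKA's AbstractOutput: a sink for prediction lines. */
    static abstract class PredictionOutput {
        abstract void append(String line);
    }

    /** Hypothetical stand-in for PlainText: wraps a StringBuffer and writes into it. */
    static class BufferOutput extends PredictionOutput {
        private final StringBuffer buffer;

        BufferOutput(StringBuffer buffer) {
            this.buffer = buffer;
        }

        @Override
        void append(String line) {
            buffer.append(line).append('\n');
        }
    }

    /** Mimics a method with an Object... tail: anything compiles, the cast happens at run time. */
    static void evaluate(int instances, Object... forPredictionsPrinting) {
        PredictionOutput out = null;
        if (forPredictionsPrinting.length > 0) {
            // Throws ClassCastException when the first argument is not a PredictionOutput
            out = (PredictionOutput) forPredictionsPrinting[0];
        }
        for (int i = 0; i < instances; i++) {
            if (out != null) {
                out.append("inst#" + (i + 1));
            }
        }
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer();
        try {
            evaluate(3, buffer); // compiles, but the raw buffer fails the cast
        } catch (ClassCastException e) {
            System.out.println("raw buffer rejected: " + e.getClass().getSimpleName());
        }
        evaluate(3, new BufferOutput(buffer)); // wrapping the buffer satisfies the cast
        System.out.print(buffer);
    }
}
```

The same idea applies to the real API: the StringBuffer still ends up holding the predictions, but it has to be handed over inside an AbstractOutput implementation rather than passed directly.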

Answer 1 (score: 0)

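I was stuck on this a few days ago. I wanted to evaluate a Weka classifier in MATLAB using a matrix instead of loading the data from an ARFF file. I used http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface and the following source code. I hope this helps someone.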

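import weka.classifiers.*;
import java.util.*;

% processed: Weka Instances object loaded with loadARFF
wekaClassifier = javaObject('weka.classifiers.trees.J48');
wekaClassifier.buildClassifier(processed);

e = javaObject('weka.classifiers.Evaluation', processed);
myrand = Random(1);
% PlainText is an AbstractOutput that writes each prediction into the buffer
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer);
bool = javaObject('java.lang.Boolean', true); % also output the class distribution
range = javaObject('weka.core.Range', '1');   % attribute(s) to print with each prediction
% Pack the varargs of crossValidateModel into a Java Object array
array = javaArray('java.lang.Object', 3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
% testing: the Weka Instances to cross-validate, created elsewhere
e.crossValidateModel(wekaClassifier, testing, 10, myrand, array);
e.toClassDetailsString
disp(char(buffer.toString())) % per-instance predictions collected by PlainText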

Asdrúbal López-Chau
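
A note on the "classifier not built" message from the question: crossValidateModel trains a separate copy of the classifier for each fold and leaves the object passed in untouched, so only an explicit buildClassifier call, as in the code above, leaves it trained afterwards. The sketch below illustrates that behaviour without WEKA. MajorityClassifier and crossValidate are hypothetical stand-ins written for this example, and the fold assignment is a plain shuffled round-robin rather than WEKA's stratified split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class FoldCopySketch {

    /** Toy majority-class learner; a hypothetical stand-in for a classifier such as J48. */
    static class MajorityClassifier {
        private Integer majority; // stays null until build() has been called

        void build(List<Integer> trainLabels) {
            int ones = 0;
            for (int label : trainLabels) {
                ones += label;
            }
            majority = (2 * ones >= trainLabels.size()) ? 1 : 0;
        }

        int classify() {
            if (majority == null) {
                throw new IllegalStateException("classifier not built");
            }
            return majority;
        }
    }

    /**
     * Returns one prediction per instance, each made by a model that never saw that instance.
     * The 'prototype' argument is deliberately never trained; every fold gets a fresh copy.
     */
    static int[] crossValidate(MajorityClassifier prototype, List<Integer> labels, int folds, Random random) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < labels.size(); i++) {
            order.add(i);
        }
        Collections.shuffle(order, random);

        int[] predictions = new int[labels.size()];
        for (int fold = 0; fold < folds; fold++) {
            List<Integer> trainLabels = new ArrayList<>();
            List<Integer> testIndices = new ArrayList<>();
            for (int pos = 0; pos < order.size(); pos++) {
                if (pos % folds == fold) {
                    testIndices.add(order.get(pos));
                } else {
                    trainLabels.add(labels.get(order.get(pos)));
                }
            }
            MajorityClassifier copy = new MajorityClassifier(); // per-fold model
            copy.build(trainLabels);
            for (int index : testIndices) {
                predictions[index] = copy.classify();
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        List<Integer> labels = Arrays.asList(1, 1, 1, 0, 1, 1, 0, 1, 1, 1);
        MajorityClassifier classifier = new MajorityClassifier();
        int[] predictions = crossValidate(classifier, labels, 5, new Random(1));
        for (int i = 0; i < predictions.length; i++) {
            System.out.println("instance " + i + ": predicted " + predictions[i] + ", actual " + labels.get(i));
        }
        try {
            classifier.classify(); // the object passed in was never trained
        } catch (IllegalStateException e) {
            System.out.println("after cross-validation: " + e.getMessage());
        }
    }
}
```

The per-instance predictions therefore have to be captured while the folds are being evaluated, which is the job the output object performs in the real API.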

Answer 2 (score: 0)

clc
clear
%Load from disk
fileDataset = 'cm1.arff';
myPath = 'C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\AvTh\MyPapers\Papers2014\UnderOverSampling\data\Skewed\datasetsKeel\';
javaaddpath('C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\JarsForExperiments\weka.jar');
wekaOBJ = loadARFF([myPath fileDataset]);
%Transform from data into Matlab
[data, featureNames, targetNDX, stringVals, relationName] = ... 
weka2matlab(wekaOBJ,'[]');
%Create testing and training sets in matlab format (this can be improved)
[tam, dim] = size(data);
idx = randperm(tam);
nTest = round(tam*0.3); %integer count, since tam*0.3 is not always a valid index
testIdx = idx(1 : nTest);
trainIdx = idx(nTest + 1 : end);
trainSet = data(trainIdx,:);
testSet = data(testIdx,:);
%Transform the training and the testing sets into the Weka format
testingWeka = matlab2weka('testing', featureNames, testSet);
trainingWeka = matlab2weka('training', featureNames, trainSet);
%Now evaluate classifier
import weka.classifiers.*;
import java.util.*
wekaClassifier = javaObject('weka.classifiers.trees.J48');
wekaClassifier.buildClassifier(trainingWeka);
e = javaObject('weka.classifiers.Evaluation',trainingWeka);
myrand = Random(1);
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer)
bool = javaObject('java.lang.Boolean',true);
range = javaObject('weka.core.Range','1');
array = javaArray('java.lang.Object',3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
e.crossValidateModel(wekaClassifier,testingWeka,10,myrand,array)
e.toClassDetailsString