I am using the StringToWordVector filter to convert my ARFF file to vector format.
But it throws an exception:
weka.core.WekaException: weka.classifiers.bayes.NaiveBayesMultinomialUpdateable: Not enough training instances with class labels (required: 1, provided: 0)!
I tried the same thing in the Weka Explorer, and it works fine there.
Here is my code:
ArffLoader loader = new ArffLoader();
loader.setFile(new File("valid file"));
Instances structure = loader.getStructure();
structure.setClassIndex(0);
// train NaiveBayes
NaiveBayesMultinomialUpdateable n = new NaiveBayesMultinomialUpdateable();
FilteredClassifier f = new FilteredClassifier();
StringToWordVector s = new StringToWordVector();
f.setFilter(s);
f.setClassifier(n);
f.buildClassifier(structure);
Instance current;
while ((current = loader.getNextInstance(structure)) != null)
n.updateClassifier(current);
// output generated model
System.out.println(n);
I tried another example, but it still does not work:
ArffLoader loader = new ArffLoader();
loader.setFile(new File("valid file"));
Instances structure = loader.getStructure();
// train NaiveBayes
NaiveBayesMultinomialUpdateable n = new NaiveBayesMultinomialUpdateable();
FilteredClassifier f = new FilteredClassifier();
StringToWordVector s = new StringToWordVector();
s.setInputFormat(structure);
Instances struct = Filter.useFilter(structure, s);
struct.setClassIndex(0);
System.out.println(struct.numAttributes()); // only gives 2 or 1 attributes
n.buildClassifier(struct);
Instance current;
while ((current = loader.getNextInstance(struct)) != null)
n.updateClassifier(current);
// output generated model
System.out.println(n);
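The number of attributes printed is always 2 or 1.
The StringToWordVector filter does not seem to work as expected.
Original folder: https://www.dropbox.com/sh/cma4hbe2r96ul1c/GL2wNdeVUz
Converted to ARFF: https://www.dropbox.com/s/efle6ci4lb5riq7/test1.arff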
Answer 0 (score: 1)
According to your ARFF file, the class appears to be the second of the two attributes, so the problem may be here:
struct.setClassIndex(0);
Try:
struct.setClassIndex(1);
Update: I made this change to the first example; it ran without any exception and printed:
The independent probability of a class
--------------------------------------
oil spill 40.0
police 989.0
The probability of a word given the class
-----------------------------------------
oil spill police
class Infinity Infinity
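Rather than guessing the index, the position of the class attribute can be read off the ARFF header. The following is only a minimal sketch in plain Java (no Weka dependency). The helper name indexOfAttribute, the relation name, and the attribute names "text" and "class" are assumptions made for illustration and are not taken from the linked file; only the labels "oil spill" and "police" come from the output above. It does not handle quoted attribute names that contain spaces.

```java
final class ArffClassIndex {

    // Returns the zero-based position of the attribute called `name` among
    // the @attribute declarations of an ARFF header, or -1 if it is absent.
    // Hypothetical helper: unquoted attribute names only.
    static int indexOfAttribute(String arffHeader, String name) {
        int index = 0;
        for (String line : arffHeader.split("\\R")) {
            String trimmed = line.trim();
            String lower = trimmed.toLowerCase();
            if (lower.startsWith("@data")) {
                break; // the header ends where the data section starts
            }
            if (!lower.startsWith("@attribute")) {
                continue; // skip @relation, comments, and blank lines
            }
            // "@attribute <name> <type>" splits into at most three parts
            String[] parts = trimmed.split("\\s+", 3);
            if (parts.length >= 2 && parts[1].equals(name)) {
                return index;
            }
            index++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed header layout: a string attribute followed by a nominal class.
        String header = String.join("\n",
                "@relation test1",
                "@attribute text string",
                "@attribute class {'oil spill',police}",
                "@data");
        System.out.println(indexOfAttribute(header, "class")); // prints 1
    }
}
```

The returned value is what would be passed to setClassIndex. When the class is known to be the last attribute, the usual Weka idiom structure.setClassIndex(structure.numAttributes() - 1) gives the same result without any parsing.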