Question

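I would like to know whether there is a way to train a model with Naive Bayes and then apply it to a single record. I am new to Weka, so I don't know whether this is possible. Also, is there a way to store the classifier output in a file?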

Answer 1

Yes. Naive Bayes is a simple probabilistic model based on Bayes' theorem, and it can be used for classification problems.

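To classify with Naive Bayes, as with other classifiers, you first train the model on a sample data set; once it is trained, the model can be applied to any record.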
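To make the train-then-apply idea concrete, here is a minimal sketch of a categorical Naive Bayes in plain Java. It is not Weka's implementation; the class name TinyNaiveBayes, the add-one smoothing, and the toy weather data are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative categorical Naive Bayes with add-one smoothing (not Weka's code). */
class TinyNaiveBayes {
    private final Map<String, Integer> classCounts = new HashMap<>();
    // key "label|attributeIndex|value" -> number of training rows with that combination
    private final Map<String, Integer> valueCounts = new HashMap<>();
    // attribute index -> distinct values seen during training
    private final Map<Integer, Set<String>> distinctValues = new HashMap<>();
    private int total = 0;

    /** Training step: count class labels and attribute values per class. */
    void train(String[][] rows, String[] labels) {
        for (int r = 0; r < rows.length; r++) {
            classCounts.merge(labels[r], 1, Integer::sum);
            total++;
            for (int a = 0; a < rows[r].length; a++) {
                valueCounts.merge(labels[r] + "|" + a + "|" + rows[r][a], 1, Integer::sum);
                distinctValues.computeIfAbsent(a, k -> new HashSet<>()).add(rows[r][a]);
            }
        }
    }

    /** Application step: return the most probable label for one record. */
    String classify(String[] record) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> e : classCounts.entrySet()) {
            String label = e.getKey();
            int n = e.getValue();
            // log prior plus the sum of smoothed log likelihoods
            double score = Math.log((double) n / total);
            for (int a = 0; a < record.length; a++) {
                int count = valueCounts.getOrDefault(label + "|" + a + "|" + record[a], 0);
                Set<String> seen = distinctValues.get(a);
                int k = (seen == null) ? 1 : seen.size();
                score += Math.log((count + 1.0) / (n + k));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"sunny", "hot"}, {"sunny", "mild"}, {"rainy", "mild"},
            {"overcast", "hot"}, {"overcast", "mild"}
        };
        String[] labels = {"no", "no", "yes", "yes", "yes"};
        TinyNaiveBayes nb = new TinyNaiveBayes();
        nb.train(rows, labels);
        // A single new record is classified with the already trained model.
        System.out.println(nb.classify(new String[] {"overcast", "mild"}));
    }
}
```

The model is built once in train and can then be queried with classify for as many individual records as needed, which is the same split that Weka exposes through buildClassifier and classifyInstance.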

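With this approach there is always some probability of error, which depends mainly on the quality of the sample and the properties of the data set.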

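I have not used Weka directly, only as an extension to RapidMiner, but the same principles should apply. Once the model is trained, you should be able to view or print the model parameters.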

Answer 2

I was looking for the same answer, using Java.

I created an ARFF file containing the training data and used the program http://weka.wikispaces.com/file/view/WekaDemo.java as an example to train and evaluate the classifier.

I still need to figure out how to save and load a model in Java and, more importantly, how to test against a single record.

WekaDemo.java

  ...
  public void execute() throws Exception {
    // run filter
    m_Filter.setInputFormat(m_Training);
    Instances filtered = Filter.useFilter(m_Training, m_Filter);

    // train classifier on complete file for tree
    m_Classifier.buildClassifier(filtered);

    // 10fold CV with seed=1
    m_Evaluation = new Evaluation(filtered);
    m_Evaluation.crossValidateModel(
        m_Classifier, filtered, 10, m_Training.getRandomNumberGenerator(1));
    // TODO Save model
    // TODO Load model
    // TODO Test against a single record
  }
  ...

Edit 1:

Saving and loading a model is explained here: How to test existing model with new instance in weka, using java code?
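The three TODOs in the listing above could be filled in roughly as follows. This is a minimal sketch, not the code from the linked answer. It assumes weka.jar is on the classpath, that the class attribute is the last one in the ARFF file, and that Weka 3.7 or later is used (DenseInstance does not exist in 3.6). The class name and both file paths are hypothetical.

```java
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.SerializationHelper;
import weka.core.converters.ConverterUtils.DataSource;

public class NaiveBayesSingleInstance {
    public static void main(String[] args) throws Exception {
        // Hypothetical paths: replace with your own files.
        String trainingPath = "/some/where/training.arff";
        String modelPath = "/some/where/naivebayes.model";

        // Load the training data; assumption: the class is the last attribute.
        Instances training = DataSource.read(trainingPath);
        training.setClassIndex(training.numAttributes() - 1);

        // Train the classifier and save it (Java serialization under the hood).
        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(training);
        SerializationHelper.write(modelPath, nb);

        // Load the model back, as a separate program would do.
        Classifier loaded = (Classifier) SerializationHelper.read(modelPath);

        // A single record needs a dataset header so Weka knows the attributes.
        Instances header = new Instances(training, 0);
        Instance record = new DenseInstance(header.numAttributes());
        record.setDataset(header);

        // Stand-in values: copy the first training row, leaving the class missing.
        // In real use, set your own values, e.g. record.setValue(attributeIndex, value).
        Instance template = training.instance(0);
        for (int i = 0; i < header.numAttributes(); i++) {
            if (i != header.classIndex()) {
                record.setValue(i, template.value(i));
            }
        }

        // Classify the single record and print the class probabilities.
        double predicted = loaded.classifyInstance(record);
        System.out.println("Predicted class: " + header.classAttribute().value((int) predicted));
        double[] dist = loaded.distributionForInstance(record);
        for (int j = 0; j < dist.length; j++) {
            System.out.println(header.classAttribute().value(j) + ": " + dist[j]);
        }
    }
}
```

The header-only copy created with new Instances(training, 0) keeps the attribute definitions and the class index but no rows, which is enough for the classifier to interpret one new record.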

Answer 3

Classifying a single instance is covered briefly at http://weka.wikispaces.com/Use+WEKA+in+your+Java+code#Classification-Classifying%20instances:

// imports needed by this snippet
import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.Classifier;
import weka.core.Instances;

// load model (saved from user interface)
Classifier tree = (Classifier) weka.core.SerializationHelper.read("/some/where/j48.model");

// load unlabeled data
Instances unlabeled = new Instances(new BufferedReader(new FileReader("/some/where/unlabeled.arff")));

// set class attribute
unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
// create copy
Instances labeled = new Instances(unlabeled);

// label instances
for (int i = 0; i < unlabeled.numInstances(); i++) {
  // predicted class index, stored in the copy
  double clsLabel = tree.classifyInstance(unlabeled.instance(i));
  labeled.instance(i).setClassValue(clsLabel);
  System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
  // per-class probabilities for the same instance
  double[] dist = tree.distributionForInstance(unlabeled.instance(i));
  for (int j = 0; j < dist.length; j++) {
    System.out.println(unlabeled.classAttribute().value(j) + ": " + dist[j]);
  }
}

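Edit: This approach does not train, evaluate, or save the model; I usually do that with the Weka GUI (http://weka.wikispaces.com/Serialization). The example uses a tree model with a nominal class, but it should be easy to convert it into a Naive Bayes example.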
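The original question also asks about storing the classifier output in a file. One simple option, sketched here with the standard library only, is to build one text line per instance from the predicted label and the array returned by distributionForInstance, then write the lines out. PredictionWriter, formatLine, and the temporary output file are hypothetical names for illustration and are not part of Weka.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper that stores predictions as comma-separated text lines. */
class PredictionWriter {
    /** Builds "label,class1=p1,class2=p2,..." for one classified instance. */
    static String formatLine(String label, String[] classNames, double[] dist) {
        StringBuilder sb = new StringBuilder(label);
        for (int j = 0; j < dist.length; j++) {
            sb.append(',').append(classNames[j]).append('=').append(dist[j]);
        }
        return sb.toString();
    }

    /** Writes all collected lines to the given file as UTF-8 text. */
    static void write(Path file, List<String> lines) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in values; in the loop above these would come from
        // classifyInstance and distributionForInstance.
        String[] classNames = {"yes", "no"};
        List<String> lines = new ArrayList<>();
        lines.add(formatLine("yes", classNames, new double[] {0.75, 0.25}));
        lines.add(formatLine("no", classNames, new double[] {0.125, 0.875}));

        Path out = Files.createTempFile("predictions", ".csv");
        write(out, lines);
        System.out.println("Wrote " + lines.size() + " lines to " + out);
    }
}
```

Inside the labeling loop, each call to formatLine would take the label string and the dist array for the current instance, so the file ends up with one line per classified record.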

Classifying one instance in Weka using the NaiveBayes classifier
