How to pass a dynamic test instance for document classification in Java using Weka

Asked: 2017-07-05 19:30:51

Tags: java classification weka naivebayes

I am new to Weka. I am currently doing text classification with Weka and Java. My training dataset has one string attribute and one class attribute.

@RELATION test

@ATTRIBUTE tweet string
@ATTRIBUTE class {positive,negative}

I want to create a test instance dynamically and classify it with a Naive Bayes classifier.

   public static void main(String[] args) throws FileNotFoundException, IOException, Exception {

    StringToWordVector filter = new StringToWordVector();

    //training set
    BufferedReader reader = null;
    reader = new BufferedReader(new FileReader("D:/suicideTest.arff"));

    Instances train = new Instances(reader);
    train.setClassIndex(train.numAttributes() -1);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    reader.close();



    Attribute tweet = new Attribute("tweet");
    FastVector classVal = new FastVector(2);
    classVal.addElement("positive");
    classVal.addElement("negative");


    FastVector testAttributes = new FastVector(2);
    testAttributes.addElement(tweet);
    testAttributes.addElement(classVal);

    Instance testcase;
    testcase = null;

    testcase.setValue(tweet,"Hello my world");
    testcase.setValue((Attribute)testAttributes.elementAt(1),"?");

    Instances test = null;

    test.add(testcase);

    test = Filter.useFilter(test, filter);

    NaiveBayes nb = new NaiveBayes();
    nb.buildClassifier(train);

    Evaluation eval = new Evaluation(train);
    eval.crossValidateModel(nb, train, 10,new Random(1));


    double pred = nb.classifyInstance(test.instance(0));

    System.out.println("the result is   "+ pred);

}

I followed the earlier question "How to test a single test case in Weka, entered by a User?"

But I still get a java.lang.NullPointerException when I try to set a value on the test instance:

testcase.setValue(tweet,"Hello my world");

1 Answer:

Answer 0: (score: -1)

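This code works. The NullPointerException in the question occurs because testcase and test are assigned null and never instantiated before setValue and add are called on them. The test set can be created with

Instances testSet = new Instances("", allAtt, 1);

and a single instance passed to the classifier with

double pred = nb.classifyInstance(testSet.instance(0));

Full code: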

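import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

import weka.classifiers.bayes.NaiveBayes;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class TweetClassifier {

    public static void main(String[] args) throws Exception {
        StringToWordVector filter = new StringToWordVector();

        // Training set: load the ARFF file; the reader is closed automatically
        Instances train;
        try (BufferedReader reader = new BufferedReader(new FileReader("D:/test.arff"))) {
            train = new Instances(reader);
        }
        train.setClassIndex(train.numAttributes() - 1);

        // Initialise the filter on the training data; this builds its word dictionary
        filter.setInputFormat(train);
        train = Filter.useFilter(train, filter);

        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(train);

        // Class attribute: same labels, in the same order, as the training ARFF header
        ArrayList<String> cls = new ArrayList<>(2);
        cls.add("positive");
        cls.add("negative");
        Attribute clsAtt = new Attribute("class", cls);

        // A null value list makes "tweet" a string attribute
        ArrayList<Attribute> allAtt = new ArrayList<>(2);
        allAtt.add(new Attribute("tweet", (List<String>) null));
        allAtt.add(clsAtt);

        // Create an empty test set with the same structure as the unfiltered training data
        Instances testSet = new Instances("", allAtt, 1);
        // Set class index
        testSet.setClassIndex(testSet.numAttributes() - 1);

        // All values of a new DenseInstance start out missing, so the class stays unknown
        String text = "I want to suicide";
        Instance inst = new DenseInstance(2);
        inst.setValue(testSet.attribute(0), text);
        testSet.add(inst);
        System.out.println(testSet.instance(0));

        // Reuse the already initialised filter; calling setInputFormat again would reset its dictionary
        testSet = Filter.useFilter(testSet, filter);

        // Classify the filtered instance, then map the predicted index to its class label
        double pred = nb.classifyInstance(testSet.instance(0));
        String predictString = testSet.classAttribute().value((int) pred);
        System.out.println("the result is " + predictString);
    }
}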
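
A note on why the test tweet has to pass through the filter that was initialised on the training data: the classifier expects one column per word seen during training, in a fixed order, so a new text has to be encoded against that same vocabulary. The following is a plain-Java sketch of that idea only. It does not use Weka, the names FixedVocabularySketch, buildVocabulary, vectorize and tokenize are hypothetical, and the simplified tokenizer is an assumption that will not match StringToWordVector exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Weka: word-presence vectors over a fixed training vocabulary. */
final class FixedVocabularySketch {

    /** Gives each distinct lower-cased training word a column index, in order of first appearance. */
    static Map<String, Integer> buildVocabulary(List<String> trainingTexts) {
        Map<String, Integer> vocabulary = new LinkedHashMap<>();
        for (String text : trainingTexts) {
            for (String word : tokenize(text)) {
                vocabulary.putIfAbsent(word, vocabulary.size());
            }
        }
        return vocabulary;
    }

    /** Encodes a text as 0/1 word-presence features; words outside the vocabulary are ignored. */
    static int[] vectorize(String text, Map<String, Integer> vocabulary) {
        int[] features = new int[vocabulary.size()];
        for (String word : tokenize(text)) {
            Integer column = vocabulary.get(word);
            if (column != null) {
                features[column] = 1;
            }
        }
        return features;
    }

    /** Simplified tokenizer (an assumption): lower-case, split on anything that is not a-z or 0-9. */
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The vocabulary comes from the training texts only
        Map<String, Integer> vocabulary =
                buildVocabulary(Arrays.asList("hello world", "goodbye cruel world"));
        System.out.println(vocabulary.keySet());   // [hello, world, goodbye, cruel]

        // The test text is encoded against the training vocabulary; "my" is dropped
        System.out.println(Arrays.toString(vectorize("Hello my world", vocabulary)));   // [1, 1, 0, 0]
    }
}
```

If the vocabulary were rebuilt from the test text instead, the columns would be hello, my, world, and the feature positions would no longer line up with what the classifier was trained on.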