I have a sentiment analysis task. My training data consists of tweets labeled as negative or positive. I built a model using StringToWordVector and NaiveBayesMultinomial.
Code:
try {
    TextDirectoryLoader loader = new TextDirectoryLoader();
    loader.setDirectory(new File("./train/"));
    Instances dataRaw = loader.getDataSet();
    System.out.println(loader.getStructure());
    StringToWordVector filter = new StringToWordVector();
    filter.setInputFormat(dataRaw);
    Instances dataFiltered = Filter.useFilter(dataRaw, filter);
    System.out.println("\n\nFiltered data:\n\n" + dataFiltered);
    // train Multinomial NaiveBayes classifier and output model
    NaiveBayesMultinomial classifier = new NaiveBayesMultinomial();
    classifier.buildClassifier(dataFiltered);
    //System.out.println("\n\nClassifier model:\n\n" + classifier);
    // save the model
    weka.core.SerializationHelper.write("./model/naviebayesmodel/", classifier);
} catch (Exception ex) {
    ex.printStackTrace();
}
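Now I want to test this model on new tweets, but I can't figure out the testing part of the classifier. I tried the following code, but it doesn't pick up any instances. How can I test new tweets with the existing model?
Code: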
try {
    Classifier cls = (Classifier) weka.core.SerializationHelper.read("./model/naviebayesmodel");
    //Instances ins = (Instances)weka.core.SerializationHelper.read("./model/naviebayesmodel");
    //System.out.println(ins);
    //i.s
    TextDirectoryLoader loader = new TextDirectoryLoader();
    loader.setDirectory(new File("./test/-1/"));
    Instances dataRaw = loader.getDataSet();
    //String data = "hello, I am your test case. This is a great clasifier :) !!";
    StringToWordVector filter = new StringToWordVector();
    filter.setInputFormat(dataRaw);
    //Instances unlabeled = new Instances(new BufferedReader(new FileReader("./test/test.txt")));
    Instances dataFiltered = Filter.useFilter(dataRaw, filter);
    dataRaw.setClassIndex(dataRaw.numAttributes() - 1);
    //Instances dataFiltered = Filter.useFilter(unlabeled, filter);
    for (int i = 0; i < dataRaw.numInstances(); i++) {
        double clsLabel = cls.classifyInstance(dataRaw.instance(i));
        System.out.println(clsLabel);
    }
    //System.out.println(dataRaw.numInstances());
} catch (Exception ex) {
    ex.printStackTrace();
}
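
A few things in the test code above look suspicious, though I have not confirmed which one is the actual cause. TextDirectoryLoader uses subdirectory names as class labels, so pointing it at "./test/-1/" (a class folder rather than its parent) may be why no instances are loaded. The test code also creates a fresh StringToWordVector on the test data, which builds a different dictionary than the one the model was trained on, and the loop classifies dataRaw rather than dataFiltered. In Weka, wrapping the filter and the classifier in a FilteredClassifier is one common way to keep the training dictionary attached to the model. The sketch below is not Weka code: it is a plain-Java illustration of the underlying idea (the vocabulary is learned once during training and reused for every new tweet). The class TinyTweetClassifier, its methods, and the sample tweets are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class (not part of Weka): a tiny multinomial
// Naive Bayes whose vocabulary is fixed by the training tweets.
class TinyTweetClassifier {
    // Dictionary learned from training data only; test tweets never extend it.
    private final Set<String> vocabulary = new HashSet<>();
    private final Map<String, Map<String, Integer>> wordCountsByLabel = new HashMap<>();
    private final Map<String, Integer> totalWordsByLabel = new HashMap<>();
    private final Map<String, Integer> docsByLabel = new HashMap<>();
    private int totalDocs = 0;

    // Lowercase the text and split on anything that is not a letter, digit or apostrophe.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Add one labeled tweet: updates the dictionary and the per-label word counts.
    void train(String label, String tweet) {
        totalDocs++;
        docsByLabel.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
                wordCountsByLabel.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(tweet)) {
            vocabulary.add(word);
            counts.merge(word, 1, Integer::sum);
            totalWordsByLabel.merge(label, 1, Integer::sum);
        }
    }

    // Score a new tweet against each label using the training dictionary.
    // Returns null if the model has not been trained.
    String classify(String tweet) {
        List<String> tokens = tokenize(tweet);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsByLabel.keySet()) {
            double score = Math.log(docsByLabel.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCountsByLabel.get(label);
            int total = totalWordsByLabel.getOrDefault(label, 0);
            for (String word : tokens) {
                if (!vocabulary.contains(word)) {
                    continue; // word unseen in training: it has no column, so skip it
                }
                int c = counts.getOrDefault(word, 0);
                // Laplace (add-one) smoothing over the training vocabulary.
                score += Math.log((c + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TinyTweetClassifier model = new TinyTweetClassifier();
        model.train("positive", "I love this great phone");
        model.train("positive", "what a great happy day");
        model.train("negative", "I hate this awful phone");
        model.train("negative", "what a terrible sad day");
        System.out.println(model.classify("this is a great classifier"));
    }
}
```

The point of the sketch is that classify never rebuilds the dictionary: words such as "classifier" that were absent from training are simply ignored. Under that assumption, the Weka equivalent would be to apply the same trained filter (or a serialized FilteredClassifier) to the new tweets instead of initializing a new StringToWordVector on them.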