I have an array of strings:
String strarr[] = {
"What a wonderful day",
"beautiful beds",
"food was awesome"
};
I also have a training data set:
Room What a beautiful room
Room Wonderful sea-view
Room beds are comfortable
Room bed-spreads are good
Food The dinner was marvellous
Food Tasty foods
Service people are rude
Service waitors were not on time
Service service was horrible
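Programmatically, I am unable to get the score and label for the strings I want to classify. It does work if I use a training data set together with a test data set file that has two columns. My problem is that, in practice, I cannot tell which label belongs to each string in my array.
How can I run the classifier directly on the array, rather than having to create a data set file?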
I get an error when I try to compute the scores:
ColumnDataClassifier cdc = new ColumnDataClassifier("examples/drogo.prop");
Classifier<String, String> cl
= cdc.makeClassifier(cdc.readTrainingExamples("examples/drogo.train"));
for (String li : strarr){
Datum<String, String> d = cdc.makeDatumFromLine(li);
System.out.println(li + " ==> " + cl.classOf(d) + " (score: " + cl.scoresOf(d) + ")");
}
Error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at edu.stanford.nlp.classify.ColumnDataClassifier.makeDatum(ColumnDataClassifier.java:738)
at edu.stanford.nlp.classify.ColumnDataClassifier.makeDatumFromStrings(ColumnDataClassifier.java:275)
at edu.stanford.nlp.classify.ColumnDataClassifier.makeDatumFromLine(ColumnDataClassifier.java:245)
at alchemypoc.DrogoClassifier.main(DrogoClassifier.java:55)
Java Result: 1
Answer 0 (score: 0)
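OK, so I did the following, and it seems to work now. Since this is a ColumnDataClassifier and it expects columnar (tab-separated) data, I added a tab before each sentence. That leaves the first column empty and puts the sentence in the second column, which appears to be where the classifier looks for the text.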
String strarr[] = {
"\tWhat a wonderful day",
"\tbeautiful beds",
"\tfood was awesome"
};
It now gives me values:
What a wonderful day ==> Room (score: {Service=-0.6692784244930884, Room=1.4113604761865859, Food=-0.7420810715491954})
beautiful beds ==> Room (score: {Service=-2.1042147142001038, Room=3.888249805012589, Food=-1.7840358277259})
food was awesome ==> Food (score: {Service=-0.44203328206155995, Room=-0.9779506257026013, Food=1.4199861760769543})
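To avoid editing the array literal by hand, the tab can also be added at run time. The following is only a minimal sketch in plain Java, under the assumption that the classifier splits each line on tabs and reads the text from column 1 (which would be consistent with the ArrayIndexOutOfBoundsException: 1 above). The class name TabPrefixSketch and the helpers withEmptyLabelColumn and columnCount are hypothetical names made up for illustration; they are not part of Stanford NLP.

```java
class TabPrefixSketch {

    // Hypothetical helper: returns a copy of the input in which every sentence
    // is prefixed with a tab, so that column 0 (the label) is empty and the
    // sentence itself ends up in column 1.
    public static String[] withEmptyLabelColumn(String[] sentences) {
        String[] lines = new String[sentences.length];
        for (int i = 0; i < sentences.length; i++) {
            lines[i] = "\t" + sentences[i];
        }
        return lines;
    }

    // Hypothetical helper: counts the tab-separated columns in a line.
    // The -1 limit keeps trailing empty columns instead of dropping them.
    public static int columnCount(String line) {
        return line.split("\t", -1).length;
    }

    public static void main(String[] args) {
        String[] strarr = {
            "What a wonderful day",
            "beautiful beds",
            "food was awesome"
        };
        for (String line : withEmptyLabelColumn(strarr)) {
            // A raw sentence has only one column; the prefixed line has two,
            // so index 1 is now valid.
            String[] cols = line.split("\t", -1);
            System.out.println(cols.length + " columns, text column: " + cols[1]);
        }
    }
}
```

Each element of the returned array could then be passed to cdc.makeDatumFromLine in place of the raw sentence, as in the loop from the question.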
If anyone has a different answer or a more correct approach, please post it.