I have StanfordNLP up and running.
My Maven dependencies look like this:
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
    <classifier>models</classifier>
</dependency>
My code runs fine, as shown below:
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Properties;

import org.junit.Test;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.CoreAnnotations.LemmaAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;

@Test
public void testTA() throws Exception
{
    Path p = Paths.get("s.txt");
    byte[] encoded = Files.readAllBytes(p);
    String s = new String(encoded);

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // read some text in the text variable
    String text = s;
    StringBuffer sb = new StringBuffer();
    sb.append(text);
    sb.append("\n\n===================================================================\n\n");

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);
    // run all Annotators on this text
    pipeline.annotate(document);

    // these are all the sentences in this document;
    // a CoreMap is essentially a Map that uses class objects as keys and
    // has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);
    sb.append("\n\n+++++++++++++++++++++++SENTENCES++++++++++++++++++++++++++++\n\n");
    for (CoreMap sentence : sentences)
    {
        // traversing the words in the current sentence;
        // a CoreLabel is a CoreMap with additional token-specific methods
        sb.append("\n\n==============SENTENCE==============\n\n");
        sb.append(sentence.toString());
        sb.append("\n");
        for (CoreLabel token : sentence.get(TokensAnnotation.class))
        {
            sb.append("\n==============TOKEN==============\n");
            // this is the text of the token
            String word = token.get(TextAnnotation.class);
            sb.append(word);
            sb.append(" : ");
            // this is the POS tag of the token
            String pos = token.get(PartOfSpeechAnnotation.class);
            sb.append(pos);
            sb.append(" : ");
            String lemma = token.get(LemmaAnnotation.class);
            sb.append(lemma);
            sb.append(" : ");
            // this is the NER label of the token
            String ne = token.get(NamedEntityTagAnnotation.class);
            sb.append(ne);
            sb.append("\n");
        }
        // this is the parse tree of the current sentence
        Tree tree = sentence.get(TreeAnnotation.class);
        sb.append("\n\n=====================TREE==================\n\n");
        sb.append(tree.toString());
        // this is the Stanford dependency graph of the current sentence
        SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
        sb.append("\n\n");
        sb.append(dependencies.toString());
    }
    // write out the accumulated report
    System.out.println(sb);
}
However, when I add openie to the pipeline, the code fails:

props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref, openie");

The error I get is:

annotator "openie" requires annotator "natlog"

Can anyone advise on this?
Answer 0 (score: 1)
The answer is that annotators in the pipeline can depend on one another. Simply add natlog to the pipeline. Crucially, a dependency must be added before the annotator that requires it, so natlog must come before openie in the annotators list.
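A minimal sketch of the corrected configuration (only the Properties setup is shown; the class and method names here are illustrative, and the rest of the pipeline code is unchanged):

```java
import java.util.Properties;

public class PipelineConfig {
    // Build the corrected annotator list: natlog is inserted before openie,
    // because openie depends on natlog.
    static Properties corenlpProps() {
        Properties props = new Properties();
        props.setProperty("annotators",
                "tokenize, ssplit, pos, lemma, parse, ner, dcoref, natlog, openie");
        return props;
    }

    public static void main(String[] args) {
        // Passing these props to new StanfordCoreNLP(props) should satisfy
        // openie's dependency on natlog.
        System.out.println(corenlpProps().getProperty("annotators"));
    }
}
```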
Also,