StanfordNLP OpenIE fails

Date: 2016-06-12 00:26:53

Tags: nlp stanford-nlp

I have StanfordNLP up and running.

My Maven dependencies are as follows:

<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
    <classifier>models</classifier>
</dependency>

My code runs fine and looks like this:

@Test
public void testTA() throws Exception
{

    Path p = Paths.get("s.txt");

    byte[] encoded = Files.readAllBytes(p);
    String s = new String(encoded);

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // read some text in the text variable
    String text = s;

    StringBuffer sb = new StringBuffer();

    sb.append(text);
    sb.append(
            "\n\n\n\n\n\n\n===================================================================\n\n\n\n\n\n\n\n\n\n\n");

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);

    // run all Annotators on this text
    pipeline.annotate(document);

    // these are all the sentences in this document
    // a CoreMap is essentially a Map that uses class objects as keys and
    // has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);

    sb.append(
            "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n+++++++++++++++++++++++SENTENCES++++++++++++++++++++++++++++\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n");
    for (CoreMap sentence : sentences)
    {
        // traversing the words in the current sentence
        // a CoreLabel is a CoreMap with additional token-specific methods
        sb.append("\n\n\n==============SENTENCE==============\n\n\n");
        sb.append(sentence.toString());
        sb.append("\n");
        for (CoreLabel token : sentence.get(TokensAnnotation.class))
        {
            // this is the text of the token
            sb.append("\n==============TOKEN==============\n");
            String word = token.get(TextAnnotation.class);
            sb.append(word);
            sb.append(" : ");
            // this is the POS tag of the token
            String pos = token.get(PartOfSpeechAnnotation.class);
            // this is the NER label of the token
            sb.append(pos);
            sb.append(" : ");
            String lemma = token.get(LemmaAnnotation.class);
            sb.append(lemma);
            sb.append(" : ");
            String ne = token.get(NamedEntityTagAnnotation.class);
            sb.append(ne);
            sb.append("\n");

        }

        // this is the parse tree of the current sentence
        Tree tree = sentence.get(TreeAnnotation.class);
        sb.append("\n\n\n=====================TREE==================\n\n\n");
        sb.append(tree.toString());

        // this is the Stanford dependency graph of the current sentence
        SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
        sb.append("\n\n\n");
        sb.append(dependencies.toString());
    }
}

However, when I add openie to the pipeline, the code fails:

props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref, openie");

The error I get is:

annotator "openie" requires annotator "natlog"

Can anyone advise on this?

1 Answer:

Answer 0 (score: 1)

The answer is that annotators in a pipeline can depend on one another. Simply add natlog to the pipeline. Crucially, a dependency must be added before the annotator that needs it, so:

  • natlog must come before openie in the pipeline.
  • depparse must come before natlog in the pipeline.

Also,

  • parse must come before dcoref in the pipeline.
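The ordering rules above can be sketched as a minimal working pipeline. This is an illustrative sketch, assuming CoreNLP 3.6.0 as in the question's Maven setup; the class name and input sentence are made up for the example:

```java
import java.util.Collection;
import java.util.Properties;

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

public class OpenIeExample
{
    public static void main(String[] args)
    {
        Properties props = new Properties();
        // Order matters: depparse before natlog, natlog before openie,
        // and parse before dcoref.
        props.setProperty("annotators",
                "tokenize, ssplit, pos, lemma, ner, parse, dcoref, depparse, natlog, openie");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation document = new Annotation("Obama was born in Hawaii.");
        pipeline.annotate(document);

        for (CoreMap sentence : document.get(SentencesAnnotation.class))
        {
            // OpenIE attaches its results to each sentence as RelationTriples.
            Collection<RelationTriple> triples =
                    sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
            for (RelationTriple triple : triples)
            {
                System.out.println(triple.subjectGloss() + " | "
                        + triple.relationGloss() + " | " + triple.objectGloss());
            }
        }
    }
}
```

Note that the annotators string lists depparse, natlog, and openie at the end, each after the annotators it depends on; with the ordering from the question's original string, the same "requires annotator" error would be raised.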