How to extract the Wikipedia entity matched to a CoreEntityMention (WikiDictAnnotator)

Time: 2019-07-17 09:55:35

Tags: java stanford-nlp

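I am running CoreNLP on some text and matching the entities it finds to Wikipedia entities. I would like to reconstruct the sentences, adding links and other useful information for the entities that were found.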

CoreEntityMention has an entity() method, but it only returns a String.

import java.util.Properties;
import edu.stanford.nlp.pipeline.*;

Properties props = new Properties();
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,entitylink");

// set up pipeline
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// wrap the raw text in a CoreDocument so that sentences() is available after annotation
CoreDocument doc = new CoreDocument("text goes here");
pipeline.annotate(doc);

// Iterate over the sentences
for (CoreSentence sentence : doc.sentences()) {
    // Go through all mentions
    for (CoreEntityMention em : sentence.entityMentions()) {
        System.out.println(em.sentence());
        // Here I would like to extract the Wikipedia entity information
        System.out.println(em.entity());
    }
}

1 Answer:

Answer 0: (score: 0)

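The string returned by entity() is the Wikipedia page title, so you only need to prepend the Wikipedia page URL (https://en.wikipedia.org/wiki/) to it.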

So Neil_Armstrong maps to https://en.wikipedia.org/wiki/Neil_Armstrong
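
As a minimal sketch of that mapping, the helper below turns the value of em.entity() into an English Wikipedia link. The class name WikipediaLinks and the method toWikipediaUrl are hypothetical, not part of CoreNLP. The sketch assumes that a mention without a link yields null or the placeholder "O", and that URLEncoder.encode(String, Charset) is available (Java 10 or later).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, not part of CoreNLP.
class WikipediaLinks {
    private static final String BASE_URL = "https://en.wikipedia.org/wiki/";

    // Builds a Wikipedia URL from the string returned by CoreEntityMention.entity().
    // Assumption: an unlinked mention is represented by null, "" or "O".
    static Optional<String> toWikipediaUrl(String entity) {
        if (entity == null || entity.isEmpty() || entity.equals("O")) {
            return Optional.empty();
        }
        // Wikipedia titles use underscores in place of spaces.
        String title = entity.trim().replace(' ', '_');
        // Percent-encode any remaining characters that are unsafe in a URL path.
        String encoded = URLEncoder.encode(title, StandardCharsets.UTF_8);
        return Optional.of(BASE_URL + encoded);
    }

    public static void main(String[] args) {
        // prints https://en.wikipedia.org/wiki/Neil_Armstrong
        System.out.println(toWikipediaUrl("Neil_Armstrong").orElse("(no link)"));
    }
}
```

Inside the loop from the question, this could be called as WikipediaLinks.toWikipediaUrl(em.entity()), and the link attached to the mention only when the returned Optional is present.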