Question

Basically, what I want is to replace all pronouns in a text with the actual entities they refer to.

        // Path to the folder with models extracted from `stanford-corenlp-3.7.0-models.jar`
        var jarRoot = ...

        // Text for processing
        var text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply.";

        // Annotation pipeline configuration
        var props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
        props.setProperty("ner.useSUTime", "0");

        // We should change current directory, so StanfordCoreNLP could find all the model files automatically
        var curDir = Environment.CurrentDirectory;
        Directory.SetCurrentDirectory(jarRoot);
        var pipeline = new StanfordCoreNLP(props);
        Directory.SetCurrentDirectory(curDir);

        // Annotation
        var annotation = new Annotation(text);
        pipeline.annotate(annotation);

        var graph = annotation.get(new CorefChainAnnotation().getClass());
        Console.WriteLine(graph);

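So far I have only found how to "pretty print" it, but I want to process the result in "graph" further, and I don't know how to actually parse the result of "annotation.get(new CorefChainAnnotation().getClass())". In Java it is said to return Map<Integer, CorefChain>, but I don't know how that should work in C#.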

Do you have any ideas?

Answer 1

Once you have the annotation, you can get the graph by casting.

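    // `annotation` is the Annotation object after pipeline.annotate(annotation) has run
    Map graph = (Map)annotation.get(new CorefCoreAnnotations.CorefChainAnnotation().getClass());
    var entrySetValues = graph.entrySet();
    Iterator it = entrySetValues.iterator();
    while (it.hasNext())
    {
        // Each entry maps a chain id to a CorefChain
        Map.Entry kvpair = (Map.Entry)it.next();
        CorefChain corefChain = (CorefChain)kvpair.getValue();

        // Mentions of the same entity, in the order they appear in the text
        var mentionsList = corefChain.getMentionsInTextualOrder() as ArrayList;
        foreach (CorefMention mention in mentionsList)
        {
            string noun = mention.mentionSpan;
            // do other stuff
        }

        // No it.remove() here: removing entries would modify the map stored in the annotation
    }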

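For C#, the idea is to first cast the object correctly, get the list from the cast object as an ArrayList, loop over the ArrayList, and cast each object correctly again.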
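Once the mentions have been read out of each chain, the pronoun replacement the question asks for could be sketched roughly as below. This is only an illustration under stated assumptions, not the library's API: `MentionSpan` and `PronounReplacer` are hypothetical names, and the sketch assumes you have already copied the 1-based sentence number, the 1-based start token index, and the exclusive end token index from each mention, together with a flag saying whether the mention is a pronoun and the text of the chain's representative mention.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the values copied out of one coreference mention:
// 1-based sentence number, 1-based start token index, exclusive end token index.
public record MentionSpan(int SentNum, int StartIndex, int EndIndex, bool IsPronoun);

public static class PronounReplacer
{
    // sentences: the tokenized text, one token list per sentence.
    // chains: for each chain, the representative text and all of its mentions.
    public static string Replace(
        List<List<string>> sentences,
        IEnumerable<(string Representative, List<MentionSpan> Mentions)> chains)
    {
        // Work on a copy so the caller's token lists stay untouched.
        var output = sentences.Select(s => new List<string>(s)).ToList();

        // Keep only pronoun mentions and apply them from right to left,
        // so that earlier token indices stay valid after each replacement.
        var edits = chains
            .SelectMany(c => c.Mentions
                .Where(m => m.IsPronoun)
                .Select(m => (Mention: m, c.Representative)))
            .OrderByDescending(e => e.Mention.SentNum)
            .ThenByDescending(e => e.Mention.StartIndex);

        foreach (var (mention, representative) in edits)
        {
            var tokens = output[mention.SentNum - 1];
            int start = mention.StartIndex - 1;
            int count = mention.EndIndex - mention.StartIndex;

            // Swap the pronoun tokens for the representative text.
            tokens.RemoveRange(start, count);
            tokens.Insert(start, representative);
        }

        // Tokens are joined with single spaces; original spacing is not restored.
        return string.Join(" ", output.Select(s => string.Join(" ", s)));
    }
}
```

For the example text, with "He" in the second sentence linked to a chain whose representative text is "Kosgi Santosh", this would yield a token-joined string in which "He" is replaced by "Kosgi Santosh". Rebuilding the original spacing and punctuation is left out to keep the sketch short.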

Resolving coreference using the Stanford Parser in .NET
