Question

As usual, I iterate over the sentences of an annotated document in a for loop (Java):

for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
    ...
}

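Then, inside the loop, I use a word's index to remove that word (for example "teacher") from the sentence, use the CoreLabel method setWord() to set the new word text to "John", and finally add the updated word back into the sentence at the same index: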

sentence.get(CoreAnnotations.TokensAnnotation.class).remove(token.get(CoreAnnotations.IndexAnnotation.class));
token.setWord("John");
sentence.get(CoreAnnotations.TokensAnnotation.class).add(token.get(CoreAnnotations.IndexAnnotation.class),token);

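The problem is that the sentence stays unchanged. Even if I print the sentence text immediately after the removal, it has not changed. Am I doing something wrong? Is there a more sensible way to do this?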

Answer 1

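Even though you have changed the word, I'd venture a guess that you haven't changed the originalText. In general, you should be a bit wary of these transformations: they can have all sorts of weird effects (for example, your character offsets will be broken). But if you're feeling brave and want to fix the bug at hand, you should be able to fix it by setting: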

token.setOriginalText("John");
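
Separately, the remove/add calls in the question may not be doing what they appear to. This is only a sketch with plain strings standing in for CoreLabel tokens (the class name ReplaceTokenSketch and the helper replaceAt are made up for illustration, they are not part of CoreNLP). It shows two things: passing a boxed Integer to List.remove() selects remove(Object), so nothing is removed, and a position that counts from 1 (which, as far as I know, is how CoreNLP's IndexAnnotation numbers tokens) has to be shifted by one before it is used as a list index. Overwriting the slot with set() avoids the remove/add pair altogether.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReplaceTokenSketch {
    // Hypothetical helper: overwrite the token at a 1-based position in place.
    static List<String> replaceAt(List<String> tokens, Integer oneBasedIndex, String replacement) {
        // Convert the 1-based position to a 0-based list index.
        tokens.set(oneBasedIndex - 1, replacement);
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>(Arrays.asList("The", "teacher", "smiled"));
        Integer index = 2; // boxed, like the value returned by token.get(...)

        // Resolves to remove(Object): searches for an element equal to 2, finds none.
        boolean removed = tokens.remove(index);
        System.out.println(removed + " " + tokens); // false [The, teacher, smiled]

        // In-place replacement at the intended position.
        replaceAt(tokens, index, "John");
        System.out.println(tokens); // [The, John, smiled]
    }
}
```

With real CoreLabel objects the same idea would apply to the list returned by sentence.get(CoreAnnotations.TokensAnnotation.class), but I have not tested that here, and the caveat above about offsets still holds.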

How can I replace a token (CoreLabel) in a sentence (CoreMap) using Stanford NLP?
