Question

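I am using Python's NLTK library to call the Stanford NER tagger, but the tags it returns are inconsistent with those from the online demo on the Stanford NER tagger website. Some words that are tagged in the online demo are not tagged when I go through the Python API, and some words are given a different tag. I am using the same classifier as the one named on the website. Can anyone tell me why this happens and how to fix it?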

Answer 1

I ran into the same problem and determined that my code and the online demo were applying different formatting rules to the text.

https://github.com/dat/pyner/blob/master/ner/client.py

for s in ('\f', '\n', '\r', '\t', '\v'):  # strip whitespace control characters
    text = text.replace(s, '')
text += '\n'  # ensure end-of-line
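The snippet above can be wrapped in a small helper so that the same normalization is applied to every string before it reaches the tagger. This is only a sketch: the function name normalize_for_ner and the replacement parameter are hypothetical and are not part of pyner or NLTK. The default mirrors the snippet by deleting the characters outright, which joins words that were separated only by a tab or newline; passing a space instead may be closer to what you want, but whether either choice reproduces the online demo's output is an assumption you would need to check against your own text.

```python
def normalize_for_ner(text, replacement=''):
    """Remove form feeds, newlines, carriage returns, tabs and vertical tabs,
    then make sure the text ends with exactly one added newline.

    `replacement` is a hypothetical knob: '' mirrors the snippet above,
    ' ' keeps words apart that were separated only by those characters.
    """
    for ch in ('\f', '\n', '\r', '\t', '\v'):
        text = text.replace(ch, replacement)
    return text + '\n'


if __name__ == '__main__':
    raw = "Barack Obama\twas born in\r\nHawaii."
    # Show both variants so the difference in word boundaries is visible.
    print(repr(normalize_for_ner(raw)))
    print(repr(normalize_for_ner(raw, replacement=' ')))
```

The normalized string would then be tokenized and handed to the tagger as usual. Comparing the tagger's output on the raw and the normalized text is a quick way to see whether formatting accounts for the mismatch in your case.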

Inconsistency between the NLTK Stanford NER tagger and the Stanford NER tagger online demo
