NLTK Stanford POS Tagger slower than expected

Time: 2016-09-30 21:55:51

Tags: python nltk pos-tagger

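The Stanford POS Tagger documentation (http://nlp.stanford.edu/software/pos-tagger-faq.shtml#h) claims the tagger can process 15,000 words per second. However, I am getting only about 7 words per second. I am using english-left3words-distsim.tagger, as the documentation recommends. Am I doing something wrong? Or is this a consequence of running it through the nltk library?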

from nltk.tag import StanfordPOSTagger
from nltk.tokenize import word_tokenize
jar = '/Users/marie/Desktop/StandfordParser/stanford-postagger-2015-12-09/stanford-postagger.jar'
model = '/Users/marie/Desktop/StandfordParser/stanford-postagger-2015-12-09/models/english-left3words-distsim.tagger'
tagger = StanfordPOSTagger(model, jar)

tokens = word_tokenize("What's the airspeed of an unladen swallow ?")

%timeit tagger.tag(tokens)

1 loop, best of 3: 1.01 s per loop
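
One way to narrow this down is to check whether the time is a fixed cost per call or a cost per token. As far as I understand NLTK's wrapper, every tagger.tag call launches a separate Java process and reloads the model, so timing a single short sentence may mostly measure start-up. That is an assumption about the cause and is not confirmed anywhere in this thread. The sketch below compares one call per sentence against a single batched call. The helper tokens_per_second and the FakeTagger stand-in are hypothetical names made up for illustration, and the 0.01 s sleep is a simulated overhead, not a measurement of the real tagger.

```python
import time


def tokens_per_second(tag_batch, sentences, repeats=3):
    """Return the best-of-`repeats` throughput of tag_batch(sentences) in tokens/s."""
    n_tokens = sum(len(sentence) for sentence in sentences)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        tag_batch(sentences)
        best = min(best, time.perf_counter() - start)
    return n_tokens / best


class FakeTagger:
    """Hypothetical stand-in: a fixed cost per call plus trivial per-token work."""

    def __init__(self, startup_seconds=0.01):
        self.startup_seconds = startup_seconds

    def tag_sents(self, sentences):
        time.sleep(self.startup_seconds)  # simulated fixed per-call cost
        return [[(token, "X") for token in sentence] for sentence in sentences]

    def tag(self, tokens):
        # Mirrors the one-sentence convenience call: one full call per sentence.
        return self.tag_sents([tokens])[0]


# 20 copies of the tokenized sentence from the question.
sentences = [["What", "'s", "the", "airspeed", "of", "an", "unladen", "swallow", "?"]] * 20
fake = FakeTagger()

one_by_one = tokens_per_second(lambda ss: [fake.tag(s) for s in ss], sentences)
batched = tokens_per_second(fake.tag_sents, sentences)
print(f"one call per sentence: {one_by_one:.0f} tokens/s")
print(f"single batched call:   {batched:.0f} tokens/s")
```

To try this with the real tagger (which needs Java and the Stanford jar), replace fake with the tagger object from the question. StanfordPOSTagger provides tag_sents, which takes a list of token lists. If the batched figure comes out far higher, the 1.01 s above is mostly per-call overhead rather than tagging time.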

0 answers:

No answers