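The Stanford POS Tagger documentation (http://nlp.stanford.edu/software/pos-tagger-faq.shtml#h) claims the tagger can process 15,000 words per second. However, I am getting only about 7 words per second. I am using english-left3words-distsim.tagger, as the documentation recommends. Am I doing something wrong? Or is this a consequence of running it through the nltk library?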
from nltk.tag import StanfordPOSTagger
from nltk.tokenize import word_tokenize  # needed for word_tokenize() below
jar = '/Users/marie/Desktop/StandfordParser/stanford-postagger-2015-12-09/stanford-postagger.jar'
model = '/Users/marie/Desktop/StandfordParser/stanford-postagger-2015-12-09/models/english-left3words-distsim.tagger'
tagger = StanfordPOSTagger(model, jar)
tokens = word_tokenize("What's the airspeed of an unladen swallow ?")
%timeit tagger.tag(tokens)
1 loop, best of 3: 1.01 s per loop
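
For reference, this is roughly how I turn the timing above into a words-per-second figure. It is only a sketch: the helper names `throughput_from_timing` and `tokens_per_second` are my own and not part of nltk, and the count of 7 words is my rough count for the sentence, not something the tagger reports.

```python
import time


def throughput_from_timing(n_tokens, seconds):
    """Convert a measured wall-clock time into tokens per second."""
    return n_tokens / seconds


def tokens_per_second(tag_fn, tokens, repeats=3):
    """Call tag_fn(tokens) `repeats` times and return the best throughput seen.

    tag_fn is any callable that accepts a list of tokens, for example
    tagger.tag from the snippet above (hypothetical usage, not run here).
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        tag_fn(tokens)
        best = min(best, time.perf_counter() - start)
    # Avoid dividing by zero if the call is faster than the timer resolution.
    return len(tokens) / best if best > 0 else float("inf")


# About 7 words tagged in 1.01 s, as measured by %timeit above.
print(round(throughput_from_timing(7, 1.01), 2))
```

With the real tagger this would be called as `tokens_per_second(tagger.tag, tokens)`, which times the same single-sentence call as the `%timeit` line.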