Question

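I'm trying to write a script that assigns POS tags to my corpus and then picks out the words whose POS tags are ambiguous. So far I have the code below, but it raises an error and doesn't look right to me. Any ideas? I'm a beginner. :)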

cfd = nltk.ConditionalFreqDist( 
            ((x[1], y[1], z[0]), z[1])
            for sent in corpus.words()
            for x, y, z in nltk.trigrams(sent))
ambigue_context = [c for c in cfd.conditions() if len(cfd[c]) > 1]
sum(cfd[c].N() for c in ambigue_context) / cfd.N()

This is the error I get:

~/UA/Assignments/opdracht.py:78: (this is the line "for sent in corpus.words()")

/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/nltk/probability.py:1747:

~/UA/Assignments/opdracht.py:79: IndexError: string index out of range

(The last one is the line "for x, y, z in nltk.trigrams(sent)".)
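
A likely cause, sketched under assumptions: corpus.words() yields plain strings, so each "sent" is a single word and its trigrams are single characters. Indexing a one-character string at [1] then raises the IndexError. The sketch below reproduces that and shows the same counting over tagged sentences. It uses only the standard library: the trigrams helper and the defaultdict of Counters are stand-ins for nltk.trigrams and nltk.ConditionalFreqDist, and tagged_sents is hypothetical toy data in the shape a tagged corpus reader's tagged_sents() would give, if the corpus in question provides one.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each sentence is a list of (word, tag) pairs,
# the shape a tagged corpus reader's tagged_sents() would yield.
tagged_sents = [
    [("the", "DT"), ("old", "JJ"), ("man", "NN"), ("sleeps", "VBZ")],
    [("the", "DT"), ("old", "JJ"), ("man", "VB"), ("boats", "NNS")],
    [("a", "DT"), ("big", "JJ"), ("dog", "NN"), ("barks", "VBZ")],
]


def trigrams(seq):
    """Stdlib stand-in for nltk.trigrams: consecutive triples of seq."""
    return zip(seq, seq[1:], seq[2:])


# Why the original fails: a word is a plain string, so its trigrams are
# single characters, and indexing a one-character string at [1] raises.
error_message = None
try:
    for x, y, z in trigrams("sleeps"):
        x[1]
except IndexError as err:
    error_message = str(err)

# Stand-in for nltk.ConditionalFreqDist: condition -> Counter of tags.
# Condition: (tag of word 1, tag of word 2, word 3); outcome: tag of word 3.
cfd = defaultdict(Counter)
for sent in tagged_sents:
    for x, y, z in trigrams(sent):
        cfd[(x[1], y[1], z[0])][z[1]] += 1

# A context is ambiguous when its third word was seen with more than one tag.
ambiguous_contexts = [c for c in cfd if len(cfd[c]) > 1]

total = sum(sum(counts.values()) for counts in cfd.values())
ambiguous = sum(sum(cfd[c].values()) for c in ambiguous_contexts)
ambiguity_ratio = ambiguous / total  # true division on Python 3

print(error_message)
print(ambiguous_contexts, ambiguity_ratio)
```

Note that on Python 2, which the traceback paths point to, "/" between two integers floors the result, so the ratio would need a float() conversion on one operand.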

NLTK ambiguity in POS tags

0 answers: