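I have been working through chapter 7 of the NLTK book on nltk.org, specifically http://www.nltk.org/book/ch07.html, where section 3.2 contains a ConsecutiveNPChunker class. I tried to reproduce the code, but it keeps raising the ValueError shown below.
My code is as follows: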
import nltk
from nltk.corpus import conll2000

train_sents = conll2000.chunked_sents('train.txt', chunk_types=['NP'])

class ConsecutiveNPChunker(nltk.ChunkParserI): # [_consec-chunker]
    def __init__(self, train_sents):
        tagged_sents = [[((w,t),c) for (w,t,c) in
                         nltk.chunk.tree2conlltags(sent)]
                        for sent in train_sents]
        self.tagger = ConsecutiveNPChunkTagger(tagged_sents)

    def parse(self, sentence):
        tagged_sents = self.tagger.tag(sentence)
        conlltags = [(w,t,c) for ((w,t),c) in tagged_sents]
        return nltk.chunk.conlltags2tree(conlltags)

def npchunk_features(sentence, i, history):
    word, pos = sentence[i]
    return {"pos": pos}

chunker = ConsecutiveNPChunker(train_sents)
Here is the error I get when I run the program:
~/.pyenv/versions/3.4.3/envs/nlp/lib/python3.4/site-packages/nltk/tag/util.py in <listcomp>(.0)
67
68 """
---> 69 return [w for (w, t) in tagged_sentence]
70
71
ValueError: need more than 1 value to unpack
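The traceback ends inside nltk/tag/util.py, at the comprehension `[w for (w, t) in tagged_sentence]`, which expects every item of the sentence to be a 2-item pair. As a rough sketch of one way this message can arise, the snippet below uses a hypothetical stand-in function `untag_like` and made-up sample data (neither is NLTK code, and whether this is the actual cause here is an assumption). It compares a sentence in the `((word, pos), chunk_tag)` shape with input that is one nesting level too shallow.

```python
# Hypothetical stand-in mirroring the comprehension shown at line 69 of the traceback.
def untag_like(tagged_sentence):
    return [w for (w, t) in tagged_sentence]

# Shape built by the training code: each item is ((word, pos), chunk_tag).
good_sentence = [(("the", "DT"), "B-NP"), (("cat", "NN"), "I-NP")]
print(untag_like(good_sentence))

# One level too shallow: a single ((word, pos), chunk_tag) item is treated as a
# whole sentence, so the one-character chunk tag "O" is unpacked into (w, t).
bad_sentence = (("sat", "VBD"), "O")
try:
    untag_like(bad_sentence)
except ValueError as err:
    # The exact wording of the message depends on the Python version.
    print("ValueError:", err)
```

Printing the first element of the data handed to the tagger is a quick way to see which of the two shapes it actually has.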
Answer 0 (score: 0)
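You have an unpacking error because you are not using zip, which takes n iterables and returns an iterator of tuples (a list in Python 2).
So, in the code of the parse() method,
conlltags = [(w,t,c) for ((w,t),c) in tagged_sents]
should be
conlltags = [(w,t,c) for ((w,t),c) in zip(tagged_sents)]
This yields multiple values to unpack.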
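Before applying the change, it may help to check how zip behaves with the data shape involved. The sketch below uses made-up sample data (the names `tagged`, `pairs`, and `chunks` are hypothetical, not from the question). With two iterables, zip pairs items positionally. With a single iterable, it wraps each item in a 1-tuple, so the `((w, t), c)` pattern only matches the two-iterable form or the plain list.

```python
# Hypothetical sample in the ((word, pos), chunk_tag) shape used by parse().
tagged = [(("the", "DT"), "B-NP"), (("cat", "NN"), "I-NP")]

# Iterating the list directly already matches the ((w, t), c) pattern.
direct = [(w, t, c) for ((w, t), c) in tagged]

# zip with ONE iterable wraps each element in a 1-tuple.
wrapped = list(zip(tagged))

# zip with TWO iterables pairs items positionally.
pairs = [("the", "DT"), ("cat", "NN")]
chunks = ["B-NP", "I-NP"]
zipped = [(w, t, c) for ((w, t), c) in zip(pairs, chunks)]

print(direct)
print(wrapped)
print(zipped)
```

Since the traceback points at nltk/tag/util.py rather than at parse(), checking the shape of the training data passed to the tagger is probably worthwhile as well.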