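Hello, I am trying this code with NLTK 3. Somehow I managed to fix line 6 so that it works with version 3 of NLTK, but the for loop still does not print anything at all.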
import nltk
sample = """ some random text content with names and countries etc"""
sentences = nltk.sent_tokenize(sample)
tokenized_sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
tagged_sentences = [nltk.pos_tag(sentence) for sentence in tokenized_sentences]
chunked_sentences = nltk.chunk.ne_chunk_sents(tagged_sentences)  # Managed to fix this to work with version 3
for i in chunked_sentences:
    if hasattr(i, 'label'):
        if i.label() == 'NE':
            print(i)
Also, if I try to debug it, I see this output:
for i in chunked_sentences:
    if hasattr(i, 'label') and i.label:
        print(i.label())
S
S
S
S
S
S
S
S
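So how do I check for 'NE'? I think something is wrong with NLTK 3, and I really cannot figure it out. Please help.
Answer 0: (score: 3)
It looks like you are iterating over the sentences. I assume you want to iterate over the individual nodes contained in each sentence.
It should work like this: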
for sentence in chunked_sentences:
    for token in sentence:
        if hasattr(token, 'label') and token.label() == 'NE':
            print(token)
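Edit: For future reference, what tipped me off that you were iterating over sentences is simply that the root node of a sentence is usually labeled 'S'.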
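To make the difference between the two loops concrete, here is a minimal sketch that runs without downloading any NLTK models. The tree is built by hand and is only an assumption about what one element of chunked_sentences looks like; the words and tags are made up for illustration. Note also that, as far as I know, the chunker only uses the generic 'NE' label when called with binary=True; with the default settings the subtrees carry labels such as PERSON or GPE, so the label in the check would need adjusting.

```python
from nltk import Tree

# Hand-built stand-in for one chunked sentence (hypothetical data, shaped
# like what ne_chunk_sents(..., binary=True) is assumed to return).
sentence = Tree('S', [
    Tree('NE', [('John', 'NNP')]),
    ('lives', 'VBZ'),
    ('in', 'IN'),
    Tree('NE', [('France', 'NNP')]),
    ('.', '.'),
])
chunked_sentences = [sentence]

# Iterating the outer list yields whole sentences; their root label is 'S'.
root_labels = [s.label() for s in chunked_sentences]

# Iterating a sentence yields its children: plain (word, tag) tuples for
# ordinary tokens, and Tree objects for chunks.
entities = []
for s in chunked_sentences:
    for node in s:
        if isinstance(node, Tree) and node.label() == 'NE':
            # Join the words of the chunk into one string.
            entities.append(' '.join(word for word, tag in node.leaves()))

print(root_labels)
print(entities)
```

The isinstance check does the same job as hasattr(token, 'label') in the answer above: it skips the plain (word, tag) tuples and keeps only the chunk subtrees.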