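So, I'm learning NLP with NLTK 3, and while working through one of the examples I'm having trouble counting the named entities in each sentence. Apparently NLTK has been updated and .node has been removed from the tree structure. Here is my code: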
When I run it, I get an error:
import nltk

# Read the whole article into memory
with open('nyt.txt', 'r') as f:
    news_content = f.read()

results = []
for sent_no, sent in enumerate(nltk.sent_tokenize(news_content)):
    tokens = nltk.word_tokenize(sent)
    no_of_tokens = len(tokens)
    tagged = nltk.pos_tag(tokens)
    # Count singular common nouns (NN) and singular proper nouns (NNP)
    nouns = len([word for word, pos in tagged if pos in ["NN", "NNP"]])
    ners = nltk.ne_chunk(tagged, binary=True)
    # The line in question: in NLTK 3 chunks no longer have a .node attribute
    no_of_ners = len([chunk for chunk in ners if hasattr(chunk, 'node')])
    score = (nouns + no_of_ners) / float(no_of_tokens)
    results.append((sent_no, no_of_tokens, no_of_ners, nouns, score, sent))

# Sort by score (ascending) and print the sixth entry
results.sort(key=lambda x: x[4])
print(results[5])
I need to access the named entities and count them. Can someone help?
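
For what it's worth, here is a minimal sketch of one way the count might be done in NLTK 3, where Tree objects expose a label() method in place of the old .node attribute. The helper names count_named_entities and count_in_sentence are hypothetical (they are not part of NLTK), and the sketch assumes the tokenizer, tagger and chunker data have already been downloaded.

```python
def count_named_entities(chunked):
    """Count the named-entity chunks at the top level of an ne_chunk result.

    Plain tokens come back as (word, tag) tuples, while named-entity chunks
    are Tree objects. In NLTK 3 a Tree has label() instead of .node, so
    checking for 'label' tells the two apart.
    """
    return sum(1 for child in chunked if hasattr(child, "label"))


def count_in_sentence(sent):
    """Hypothetical wrapper: tokenize, tag and chunk one sentence, then count."""
    # Imported here so count_named_entities can be used without NLTK data.
    import nltk

    tagged = nltk.pos_tag(nltk.word_tokenize(sent))
    chunked = nltk.ne_chunk(tagged, binary=True)
    return count_named_entities(chunked)
```

In the loop above, this would amount to replacing hasattr(chunk, 'node') with hasattr(chunk, 'label'). To read an entity's text rather than just count it, something like " ".join(word for word, tag in chunk.leaves()) should work on each matching chunk.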