I have the following code, with which I am trying to extract the nouns from a file:
import nltk
from nltk import word_tokenize
import re
from nltk.corpus import stopwords
fr = open('input.txt','r+')
fw = open('output.txt','a+')
for line in fr:
    line = line.lower() #converting from upper to lower case
    lines = ''.join([i for i in line if not i.isdigit()]) #removing numericals
    for word in lines.split():
        word=re.sub(r'[^a-zA-Z0-9]', ' ',word)
        if word not in stopwords.words('english'):
            tokenized_word=word_tokenize(word)
            tokenized_word=nltk.pos_tag(tokenized_word)
            fw.write(str(tokenized_word))
            fw.write('\n')
fr.close()
fw.close()
fw = open('noun_output.txt','w+')
with open('output.txt','r+') as fr:
    for line in fr:
        word = [word for word,pos in line if pos =='NN']
        print word
When I run this code, I get the following error:
Traceback (most recent call last):
  File "1.py", line 29, in <module>
    word = [word for word,pos in line if pos =='NN']
ValueError: need more than 1 value to unpack
I tried a few things with word (such as word[0][1], and also splitting the word), but could not solve the problem. Please help!
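
For reference, the error can probably be reproduced without NLTK. The following is a minimal sketch, assuming each line of output.txt holds the str() of one pos_tag result; the sample line is made up for illustration. Each line read back from the file is a plain string, so iterating over it yields single characters, and a single character cannot be unpacked into word and pos. One possible way around this, not necessarily the best one, is to parse the text back into a list of tuples with ast.literal_eval before filtering:

```python
import ast

# Hypothetical sample: one line as the first loop would write it to output.txt,
# i.e. the str() of a list of (word, tag) tuples followed by a newline.
line = "[('dog', 'NN')]\n"

# Iterating over a string yields one character at a time, so the two-name
# unpack in the comprehension fails on the very first character, '['.
try:
    [word for word, pos in line if pos == 'NN']
    unpack_failed = False
except ValueError:
    unpack_failed = True

# ast.literal_eval safely turns the text back into the list of tuples.
pairs = ast.literal_eval(line.strip())
nouns = [word for word, pos in pairs if pos == 'NN']
print(nouns)
```

Note that 'NN' matches only singular common nouns in the Penn Treebank tagset; plural and proper nouns carry the tags 'NNS', 'NNP' and 'NNPS', so a test such as pos.startswith('NN') may be closer to what is wanted. Keeping the tagged tuples in memory instead of writing them to output.txt first would avoid the parsing step altogether.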