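I build a vocabulary for a document and extract the words from each sentence. I want to compare each sentence against the generated word list, but I get the following error: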
def generate_bow(messages):
    vocab = tokenize(messages)
    print("Word List for Document \n{0} \n".format(vocab));
for sentence in messages:
    words = word_extraction(sentence)
    bag_vector = numpy.zeros(len(vocab))
    for w in words:
        for i,word in enumerate(vocab):
            if word == w:
                bag_vector[i] += 1
    print("{0}\n{1}\n".format(sentence,numpy.array(bag_vector)))
NameError                                 Traceback (most recent call last)
<ipython-input-37-34430e8c4ee8> in <module>()
      4 for sentence in messages:
      5     words = word_extraction(sentence)
----> 6     bag_vector = numpy.zeros(len(vocab))
      7     for w in words:
      8         for i,word in enumerate(vocab):

NameError: name 'vocab' is not defined
I have already imported NumPy and also tried adding dtype=float, but the same problem persists.
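
One possible reading of the traceback: the failing line is reported in `<module>`, which suggests the `for` loop runs at the top level of the cell rather than inside `generate_bow`, where `vocab` is assigned. If that is the case, `vocab` is local to the function and never exists at module level, so changing `dtype` would not help. Indenting the loop into the function body and then calling the function should make `vocab` visible. A minimal sketch under that assumption follows; `tokenize` and `word_extraction` are hypothetical stand-ins because their definitions are not shown, and returning the vectors is an addition made only for illustration.

```python
import re
import numpy


def word_extraction(sentence):
    # Hypothetical stand-in: lowercase the sentence and split on non-word characters.
    return [w for w in re.split(r"\W+", sentence.lower()) if w]


def tokenize(messages):
    # Hypothetical stand-in: sorted list of the unique words across all sentences.
    words = set()
    for sentence in messages:
        words.update(word_extraction(sentence))
    return sorted(words)


def generate_bow(messages):
    vocab = tokenize(messages)  # vocab is local to generate_bow
    print("Word List for Document \n{0} \n".format(vocab))
    vectors = []
    # The loop is indented into the function body, so it can see vocab.
    for sentence in messages:
        words = word_extraction(sentence)
        bag_vector = numpy.zeros(len(vocab))
        for w in words:
            for i, word in enumerate(vocab):
                if word == w:
                    bag_vector[i] += 1
        print("{0}\n{1}\n".format(sentence, bag_vector))
        vectors.append(bag_vector)
    # Returning the vectors is an assumption added for illustration.
    return vectors


messages = ["the cat sat", "the cat saw the dog"]
vectors = generate_bow(messages)
```

Defining the function is not enough on its own: nothing runs until `generate_bow(messages)` is called, and `vocab` still cannot be read from outside the function afterwards.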