Question

I am trying to run the example given on Stack Overflow here.

I have copied the code below:

from sklearn.feature_extraction.text import TfidfVectorizer
text_files = ['file1.txt', 'file2.txt']
documents = [open(f) for f in text_files]
tfidf = TfidfVectorizer().fit_transform(documents)
# no need to normalize, since Vectorizer will return normalized tf-idf
pairwise_similarity = tfidf * tfidf.T

The only thing I added is this line:

text_files = ['file1.txt', 'file2.txt']

When I run the code, I get this error:

File "C:\Python33\lib\site-packages\sklearn\feature_extraction\text.py", line 195, in <lambda>
return lambda x: strip_accents(x.lower())
AttributeError: '_io.TextIOWrapper' object has no attribute 'lower'

file1.txt and file2.txt are the input text files. Am I using the wrong format for text_files? What causes this error, and how can I fix it? I would really appreciate your help.

Answer 1

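open(f) returns a file object (_io.TextIOWrapper), not a string. The vectorizer calls .lower() on each document, as the traceback shows, and a file object has no such method. That is why it fails.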
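The difference between the file object and its contents can be checked directly. The following is only a minimal sketch: the temporary file and the helper name `describe` are made up for illustration and are not part of the original code.

```python
import os
import tempfile


def describe(obj):
    """Return the type name and whether the object has a str-style .lower()."""
    return type(obj).__name__, hasattr(obj, "lower")


# Create a throwaway text file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello World")
    path = tmp.name

with open(path) as handle:
    handle_info = describe(handle)          # the file object itself
    content_info = describe(handle.read())  # the text read from the file

os.remove(path)

print(handle_info)
print(content_info)
```

The file object reports no `lower` attribute, while the string returned by `.read()` has one, which matches the AttributeError in the traceback.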

Try changing

documents = [open(f) for f in text_files]

to

documents = [open(f).read() for f in text_files]
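The one-line fix works, but it leaves the files to be closed by the garbage collector. A slightly more careful variant, sketched here under some assumptions, reads each file inside a `with` block. The helper name `read_documents`, the UTF-8 encoding, and the sample file contents are assumptions for illustration only.

```python
import os
import tempfile


def read_documents(paths, encoding="utf-8"):
    """Read each file fully and return a list of strings, closing every file."""
    documents = []
    for path in paths:
        # The encoding is an assumption; adjust it to match the real files.
        with open(path, encoding=encoding) as handle:
            documents.append(handle.read())
    return documents


# Demo with two throwaway files standing in for file1.txt and file2.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    text_files = []
    for name, body in [("file1.txt", "the cat sat"), ("file2.txt", "the dog ran")]:
        path = os.path.join(tmp_dir, name)
        with open(path, "w", encoding="utf-8") as out:
            out.write(body)
        text_files.append(path)
    documents = read_documents(text_files)

print(documents)
```

The resulting list of strings can then be passed to `TfidfVectorizer().fit_transform(documents)` as in the question. Alternatively, `TfidfVectorizer` has an `input` parameter: with `input='filename'` it should accept the list of paths directly, and with `input='file'` it should accept open file objects.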
