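I am using a Python dictionary to compare the complete works of Shakespeare against a 10,000-word dictionary. The code should write every word that is not found in the 10,000-word dictionary to a separate file named 'SpellChecker.txt'. I believe everything else in this code works correctly. The only error I hit is when saving the data to the output file, and I can't seem to fix it. Any help is appreciated.
Error: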
Traceback (most recent call last):
  File "/Users/JakeFrench/Desktop/HashTable.py", line 29, in <module>
    f1.write(word+'\n', encoding= 'utf-8')

import re
import time
start_time = time.time()
f1=open ('SpellChecker.txt', 'w+')
Dictionary = {}
Document = []
with open ('10kWords.txt', encoding= 'utf-8') as f:
    for word in f:
        Dictionary[word.rstrip()] = 1
with open ('ShakespeareFullWorks.txt', encoding= 'utf-8') as f:
    content = f.read().split(" ")
    content = [item.lower() for item in content]
    content = ' '.join(content)
    content = re.findall("\w+", content)
    for line in content:
        Document.append(line)
    for line in content:
        for word in line.split():
            if word.lower() not in Dictionary:
                f1.write(word+'\n', encoding= 'utf-8')
f1.close()
print ("--- %s seconds ---" % (time.time() - start_time))

Answer 0 (score: 2)
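Just remove the encoding argument from the write method and pass it to the open function instead. write accepts only the string to write; the encoding belongs to the file object and is set when the file is opened, like this: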
f1=open ('SpellChecker.txt', 'w+', encoding='utf-8')
...
f1.write(word+'\n')
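
For anyone who wants to try the fix in isolation, here is a minimal sketch of the same idea. The function name write_unknown_words, the sample sentence, and the tiny word set are hypothetical stand-ins for the question's files, not the asker's actual code. It uses a with block so the output file is closed automatically.

```python
import os
import re
import tempfile


def write_unknown_words(text, known_words, out_path):
    """Write each word of `text` that is missing from `known_words` to `out_path`.

    Hypothetical helper for illustration; returns the list of words written.
    """
    words = re.findall(r"\w+", text.lower())
    unknown = [w for w in words if w not in known_words]
    # The encoding is given once, to open(); write() takes only the string.
    with open(out_path, "w", encoding="utf-8") as out:
        for word in unknown:
            out.write(word + "\n")
    return unknown


# Small demo with made-up data, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "SpellChecker.txt")
    known = {"to", "be", "or", "not"}
    unknown = write_unknown_words(
        "To be, or not to be: that is the question", known, out_path
    )
    with open(out_path, encoding="utf-8") as f:
        written = f.read().splitlines()
```

A set is used for the known words here instead of a dict with dummy values; membership tests behave the same way for this purpose.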