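I'm trying to build a foreign-language frequency dictionary / vocabulary learner.
I want the program to:
- process a book and count how often each word occurs (I do this with Counter())
- save the Counter() to a pickle file, so that I don't have to process the book every time I run the program
- pull out the Nth most frequent words (easily done with the most_common() function)
The problem is that once I have processed a book and saved it to a pickle file, I can't access it again. The function that does this loads an empty dictionary, even though I can see that the pickle file does contain data when I inspect it.
Moreover, if I load the pickle file manually (using pickle.load()) and pull out the Nth most frequent word manually (calling most_common() myself instead of using the custom function that loads the pickle and pulls out the Nth most common word), it works perfectly.
I suspect there is something wrong with the custom function that loads the pickle file, but I can't figure out what it is.
Here is the code: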
import string
import collections
import pickle

freq_dict = collections.Counter()
dfn_dict = dict()

def save_dict(name, filename):
    pickle.dump(name, open('{0}.p'.format(filename), 'wb'))

#Might be a problem with this
def load_dict(name, filename):
    name = pickle.load(open('{0}.p'.format(filename), 'rb'))

def cleanedup(fh):
    for line in fh:
        word = ''
        for character in line:
            if character in string.ascii_letters:
                word += character
            else:
                yield word
                word = ''

#Opens a foreign language textfile and adds all unique
#words in it, to a Counter, ordered by frequency
def process_book(textname):
    with open(textname) as doc:
        freq_dict.update(cleanedup(doc))
    save_dict(freq_dict, 'svd_f_dict')

#Shows the Nth most frequent word in the frequency dict
def show_Nth_word(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common()[N]

#Shows the first N most frequent words in the freq. dictionary
def show_N_freq_words(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common(N)

#Presents a word to the user, allows user to define it
#adds the word and its definition to another dictionary
#which is used to store only the word and its definition
def define_word(word):
    load_dict(freq_dict, 'svd_f_dict')
    load_dict(dfn_dict, 'svd_d_dict')
    if word in freq_dict:
        definition = (input('Please define ' + str(word) + ':'))
        dfn_dict[word] = definition
    else:
        return print('Word not in dictionary!')
    save_dict(dfn_dict, 'svd_d_dict')
Here is an attempt to pull out the Nth most frequent word using both methods (manually and with the function):
from dictionary import *
import pickle
#Manual, works
freq_dict = pickle.load(open('svd_f_dict.p', 'rb'))
print(freq_dict.most_common()[2])
#Using a function defined in the other file, doesn't work
word = show_Nth_word(2)
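Thanks for your help!
Answer 0 (score: 3)
Your load_dict function stores the result of unpickling in the local variable name. The assignment only rebinds that local name inside the function; it does not modify the object you passed in as an argument, so the caller's freq_dict stays empty.
Instead, you need to return the result of calling pickle.load() from your load_dict() function: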
def load_dict(filename):
    return pickle.load(open('{0}.p'.format(filename), 'rb'))
Then assign it to your variable:
freq_dict = load_dict('svd_f_dict')
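To make the difference concrete, here is a minimal sketch that puts the two versions side by side. The names load_dict_broken and show_nth_word, the temporary file, and the sample words are assumptions made up for this illustration; they are not part of the original code. The sketch also opens files with a with block so they are closed again, which the original snippets do not do.

```python
import collections
import os
import pickle
import tempfile


def load_dict_broken(name, path):
    # Same shape as the question's load_dict: the assignment rebinds the
    # local parameter only, so the caller's object is left untouched.
    with open(path, 'rb') as fh:
        name = pickle.load(fh)


def load_dict(path):
    # Returns the unpickled object so the caller can bind it to a name.
    with open(path, 'rb') as fh:
        return pickle.load(fh)


def show_nth_word(n, path):
    # Hypothetical variant of show_Nth_word: it works on the returned
    # Counter instead of relying on a module-level freq_dict.
    freq_dict = load_dict(path)
    return freq_dict.most_common()[n]


# Pickle a small Counter to a temporary file (a stand-in for svd_f_dict.p).
saved = collections.Counter({'de': 5, 'la': 3, 'casa': 1})
path = os.path.join(tempfile.mkdtemp(), 'demo_f_dict.p')
with open(path, 'wb') as fh:
    pickle.dump(saved, fh)

empty = collections.Counter()
load_dict_broken(empty, path)
print(len(empty))              # 0: the caller's Counter was never filled

loaded = load_dict(path)
print(loaded == saved)         # True
print(show_nth_word(1, path))  # ('la', 3)
```

One thing to watch when applying this to the functions in the question: writing freq_dict = load_dict('svd_f_dict') inside a function creates a new local variable unless the function declares global freq_dict. Working with the returned value directly, as show_nth_word does here, avoids that issue.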