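I'm trying to run a short piece of code that uses the Truecase Python package (for restoring capitalization in text), and I get an UnpicklingError when it loads a tokenizer from nltk.
Because of server restrictions on my network, I can't use nltk.download to install the missing nltk files, so I downloaded the nltk_data directory directly onto my computer. It seems to find the file, but it has trouble opening the language pickle file.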
import truecase
truecase.get_true_case('hey, what is the weather in new york?')
In [4]: runfile('/Users/{me}/Downloads/truecase-0.0.4/testingtruecase.py', wdir='/Users/{me}/Downloads/truecase-0.0.4')
Reloaded modules: truecase, truecase.TrueCaser
Traceback (most recent call last):
File "<ipython-input-4-82ea1175dde8>", line 1, in <module>
runfile('/Users/{me}/Downloads/truecase-0.0.4/testingtruecase.py', wdir='/Users/{me}/Downloads/truecase-0.0.4')
File "/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 704, in runfile
execfile(filename, namespace)
File "/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/Users/{me}/Downloads/truecase-0.0.4/testingtruecase.py", line 10, in <module>
truecase.get_true_case('hey, what is the weather in new york?')
File "/Users/y99b/Downloads/truecase-0.0.4/truecase/__init__.py", line 7, in get_true_case
return caser.get_true_case(sentence, out_of_vocabulary_token_option=out_of_vocabulary_token_option)
File "/Users/{me}/Downloads/truecase-0.0.4/truecase/TrueCaser.py", line 80, in get_true_case
tokens = nltk.word_tokenize(sentence)
File "/anaconda3/lib/python3.7/site-packages/nltk/tokenize/__init__.py", line 143, in word_tokenize
sentences = [text] if preserve_line else sent_tokenize(text, language)
File "/anaconda3/lib/python3.7/site-packages/nltk/tokenize/__init__.py", line 104, in sent_tokenize
tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
File "/anaconda3/lib/python3.7/site-packages/nltk/data.py", line 873, in load
resource_val = pickle.load(opened_resource)
UnpicklingError: invalid load key, 'v'.
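Answer 0: (score: 1)
Found the problem: the pickle file I was using was invalid and did not actually contain the right data (the file itself held nothing but a link to GitHub). I downloaded the correct english.pickle file and everything works. If anyone else gets the invalid load key, 'v' error, it most likely has to do with the pickle file itself.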
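One way to confirm this kind of problem without unpickling anything is to look at the first few bytes of the file. The sketch below is only an illustration: the function name classify_pickle_file and its labels are made up for this example, and it assumes the "link to GitHub" was a Git LFS pointer file, which is plain text beginning with "version https://git-lfs...". That would explain why the first byte the unpickler choked on was 'v'.

```python
def classify_pickle_file(path):
    """Guess what a supposed pickle file really is from its first bytes.

    Hypothetical helper for illustration; it never unpickles the file.
    """
    with open(path, "rb") as handle:
        head = handle.read(64)

    if not head:
        return "empty"
    # Git LFS pointer files are small text files that start like this.
    if head.startswith(b"version https://git-lfs"):
        return "git-lfs-pointer"
    # A saved web page instead of the raw file would start with markup.
    if head.lstrip().startswith(b"<"):
        return "html"
    # Pickles written with protocol 2 or newer start with the PROTO opcode 0x80.
    if head[:1] == b"\x80":
        return "binary-pickle"
    # Could still be an old protocol 0/1 pickle, so this is not proof of damage.
    return "unknown"
```

Calling classify_pickle_file on the english.pickle inside your nltk_data/tokenizers/punkt directory should give "binary-pickle" for a good download. Anything else suggests the file was not fetched as raw binary.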
Answer 1: (score: 0)
Your pickle file is probably corrupted. Replace it with a fresh copy. That worked for me.
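After swapping in a new file, a quick check that it actually unpickles can save another round trip through truecase. This is a minimal sketch, and can_unpickle is a hypothetical name. Only run it on files from a source you trust, because unpickling executes code, and note that the real english.pickle also needs nltk importable in the same environment to load.

```python
import pickle


def can_unpickle(path):
    """Return True if the file at `path` loads as a pickle, False otherwise."""
    try:
        with open(path, "rb") as handle:
            pickle.load(handle)
    except (pickle.UnpicklingError, EOFError):
        # UnpicklingError: bad opcode such as the 'v' load key above.
        # EOFError: empty or truncated file.
        return False
    return True
```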