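I want to save an nlp doc (the document produced by a spaCy nlp model) with pickle.dump, but I get a TypeError. Can you tell me how to fix this error, or suggest another way to save the nlp document?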
import spacy
import pickle
nlp = spacy.load('en')
a="Hello Adam, how are you?"
data=nlp(a)
f = open('nlp1/test.pkl',mode='wb')
pickle.dump(data,f)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-3fd206f7756b> in <module>()
      5 data=nlp(a)
      6 f = open('nlp1/test.pkl',mode='wb')
      7 pickle.dump(data,f)

doc.pyx in spacy.tokens.doc.pickle_doc()

doc.pyx in spacy.tokens.doc.Doc.to_bytes()

~/anaconda3/lib/python3.6/site-packages/spacy/util.py in to_bytes(getters, exclude)
        if key not in exclude:
            serialized[key] = getter()
    return msgpack.dumps(serialized, use_bin_type=True, encoding='utf8')

~/anaconda3/lib/python3.6/site-packages/msgpack_numpy.py in packb(o, **kwargs)
    194     """
        return Packer(**kwargs).pack(o)

    def unpack(stream, **kwargs):

TypeError: __init__() got an unexpected keyword argument 'encoding'
Answer 0 (score: 0)
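For me, the error pointed to line 486 of spaCy's util.py. I removed the encoding argument from the .dumps() call there and it worked. It is probably not a long-term solution, since the edit is lost whenever spaCy is reinstalled or upgraded, but it is a decent temporary hack.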
File "/user/anaconda/envs/lib/python3.7/site-packages/spacy/util.py", line 486, in to_bytes
return msgpack.dumps(serialized, use_bin_type=True, encoding='utf8')
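The traceback suggests that the installed packer simply does not accept an encoding keyword, so the workaround comes down to not passing it. Here is a minimal sketch of that idea that avoids editing site-packages. Note that StandInPacker and stand_in_dumps are hypothetical stand-ins written for illustration (they use json, not msgpack), and drop_encoding is a helper name I made up; whether wrapping the real msgpack function this way is appropriate in your environment is an assumption you would need to check yourself.

```python
import functools
import json


class StandInPacker:
    """Hypothetical stand-in for a packer whose __init__ has no `encoding` parameter."""

    def __init__(self, use_bin_type=False):
        self.use_bin_type = use_bin_type

    def pack(self, obj):
        # Illustration only: serialize with json instead of msgpack.
        return json.dumps(obj).encode("utf8")


def stand_in_dumps(obj, **kwargs):
    # Same shape as the call in the traceback: kwargs are forwarded to the packer,
    # so an unsupported keyword raises TypeError inside __init__.
    return StandInPacker(**kwargs).pack(obj)


def drop_encoding(dumps):
    """Wrap a dumps-like callable so that an `encoding` keyword is discarded."""

    @functools.wraps(dumps)
    def wrapper(*args, **kwargs):
        kwargs.pop("encoding", None)  # remove the argument the packer rejects
        return dumps(*args, **kwargs)

    return wrapper


safe_dumps = drop_encoding(stand_in_dumps)
payload = safe_dumps({"text": "Hello Adam"}, use_bin_type=True, encoding="utf8")
print(payload)
```

Calling stand_in_dumps directly with encoding='utf8' raises a TypeError like the one in the question, while the wrapped version goes through. This only hides the mismatch in the same way the manual edit does, so it is still a temporary measure rather than a fix for the underlying version incompatibility.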