I want to store an instance of my own class in a spacy.Doc and save it with doc.to_disk, like this:
from dataclasses import dataclass

from spacy.tokens import Doc
from spacy.vocab import Vocab

@dataclass
class Foo:
    a: int

doc = Doc(Vocab(), [])
doc.user_data["foo"] = Foo(1)
doc.to_disk("/tmp/fooo")
However, this code raises an error:
TypeError: can not serialize 'Foo' object
What should I do?
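Answer 0 (score: 1)
According to this thread (here), you can try the following workaround. Note that it discards the custom data rather than serializing it: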
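def remove_unserializable_results(doc):
    # Drop everything stored in user_data so the Doc can be serialized.
    doc.user_data = {}
    # Reset every custom extension attribute on the Doc itself.
    for x in dir(doc._):
        if x in ['get', 'set', 'has']:
            continue
        setattr(doc._, x, None)
    # Reset every custom extension attribute on each token.
    for token in doc:
        for x in dir(token._):
            if x in ['get', 'set', 'has']:
                continue
            setattr(token._, x, None)
    return doc

# Assumes `nlp` is an already loaded pipeline. Passing the function directly
# is spaCy v2 style; in v3 the component must be registered with
# @Language.component and added by its name.
nlp.add_pipe(remove_unserializable_results, last=True)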
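If the goal is to keep the data rather than drop it, one possible alternative is to convert the object to plain built-in types before storing it. The sketch below is not from the linked thread; it uses only the standard library, and the helper names `foo_to_user_data` and `foo_from_user_data` are hypothetical. It assumes that a dict of built-in types is something the Doc serializer accepts, which is worth checking against your spaCy version.

```python
from dataclasses import asdict, dataclass


@dataclass
class Foo:
    a: int


def foo_to_user_data(foo: Foo) -> dict:
    """Turn a Foo into a plain dict of built-in types (hypothetical helper)."""
    return asdict(foo)


def foo_from_user_data(data: dict) -> Foo:
    """Rebuild a Foo from the dict produced by foo_to_user_data."""
    return Foo(**data)


# Round trip without spaCy: the dict is what would be stored on the Doc.
payload = foo_to_user_data(Foo(1))
restored = foo_from_user_data(payload)
```

With a Doc, this would mean assigning `doc.user_data["foo"] = foo_to_user_data(Foo(1))` before calling `doc.to_disk(...)`, and calling `foo_from_user_data(doc.user_data["foo"])` after loading the Doc again.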