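I'm using cPickle to serialize data that is used for logging.
I'd like to be able to throw anything I want into an object and then serialize it. Usually this works fine with cPickle, but I've run into a problem: one of the objects I want to serialize contains a function, and that makes cPickle raise an exception.
I would rather cPickle just skipped whatever it can't handle instead of making the whole process blow up.
What's a good way to achieve this?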
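Answer 0 (score: 2)
I'm assuming you are looking for a best-effort solution, and that you are fine if the unpickled result does not function properly.
For your particular use case, you may want to register a pickle handler for function objects. Just make it a dummy handler, which is good enough for best-effort purposes. Writing a handler that actually restores functions is possible, but it is rather tricky. To avoid affecting other code that pickles, you will probably want to unregister the handler when you leave your logging code.
Here's an example (without any unregistration):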
import cPickle
import copy_reg
from types import FunctionType
# data to pickle: note that o['x'] is a lambda and they
# aren't natively picklable (at this time)
o = {'x': lambda x: x, 'y': 1}
# shows that o is not natively picklable (because of
# o['x'])
try:
    cPickle.dumps(o)
except TypeError:
    print "not natively picklable"
else:
    print "was pickled natively"
# create a mechanism to turn unpicklable functions into
# stub objects (the string "STUB" in this case)
def stub_pickler(obj):
    return stub_unpickler, ()

def stub_unpickler():
    return "STUB"

copy_reg.pickle(
    FunctionType,
    stub_pickler, stub_unpickler)
# shows that o is now picklable but o['x'] is restored
# to the stub object instead of its original lambda
print cPickle.loads(cPickle.dumps(o))
This prints:
not natively picklable
{'y': 1, 'x': 'STUB'}
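The example above is Python 2 and leaves the handler registered globally. On Python 3 the modules are named pickle and copyreg, and as far as I can tell functions are pickled by reference before the copyreg table is consulted, so the same registration may have no effect there. A minimal sketch of the same stubbing idea for Python 3.8+, using the documented `Pickler.reducer_override` hook, could look like this. The names `StubbingPickler` and `dumps_with_stubs` are made up for illustration, and `str` is used as the reconstructor only because it is a builtin that pickle can always find by reference.

```python
import io
import pickle
from types import FunctionType


class StubbingPickler(pickle.Pickler):
    """Hypothetical pickler that replaces Python functions with "STUB"."""

    def reducer_override(self, obj):
        if isinstance(obj, FunctionType):
            # Reduce the function to the call str("STUB"), so unpickling
            # yields the placeholder string instead of the function.
            return str, ("STUB",)
        # Anything else is pickled the usual way.
        return NotImplemented


def dumps_with_stubs(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Like pickle.dumps, but functions are stored as stub strings."""
    buf = io.BytesIO()
    StubbingPickler(buf, protocol).dump(obj)
    return buf.getvalue()


o = {'x': lambda x: x, 'y': 1}
restored = pickle.loads(dumps_with_stubs(o))
print(restored)  # {'x': 'STUB', 'y': 1}
```

Because the override lives on a `Pickler` subclass, nothing is registered globally and there is nothing to unregister afterwards. Note that this sketch stubs every plain Python function, including module-level ones that pickle could otherwise store by reference; narrow the `isinstance` check if that is not what you want.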
Answer 1 (score: 0)
Alternatively, try cloudpickle:
>>> import cloudpickle
>>> squared = lambda x: x ** 2
>>> pickled_lambda = cloudpickle.dumps(squared)
>>> import pickle
>>> new_squared = pickle.loads(pickled_lambda)
>>> new_squared(2)
4
pip install cloudpickle
and live your dreams. dask, IPython parallel, and PySpark live the same dreams.