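How do I unpickle a dict subclass that validates with __setitem__ in Python 3?

Time: 2014-01-15 17:56:33

Tags: python python-3.x pickle python-3.3

I'm using Python 3.3. This problem may not exist in the 2.x pickle protocol, but I haven't actually verified that.

Suppose I've created a dict subclass that counts every time a key is updated. Something like this: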

class Foo(dict):
    def __init__(self):
        self.counter = 0

    def __setitem__(self, key, value):
        print(key, value, self.__dict__)
        if key == 'bar':
            self.counter += 1
        super(Foo, self).__setitem__(key, value)

You can use it like this:

>>> f = Foo()
>>> assert f.counter == 0
>>> f['bar'] = 'baz'
... logging output...        
>>> assert f.counter == 1

Now let's pickle and unpickle it:

>>> import pickle
>>> f_str = pickle.dumps(f)
>>> f_new = pickle.loads(f_str)
bar baz {}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test.py", line 133, in __setitem__
    self.counter += 1
AttributeError: 'Foo' object has no attribute 'counter'

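I think the print() in __setitem__ shows the problem: pickle.loads tries to write the dictionary's keys before it writes the object's attributes (the printed __dict__ is still empty)... at least I think that's what's happening. It's easy to verify by removing the self.counter reference from Foo.__setitem__() (called ModifiedFoo here):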

>>> f_mod = ModifiedFoo()
>>> f_mod['bar'] = 'baz'
>>> f_mod_str = pickle.dumps(f_mod)
>>> f_mod_new = pickle.loads(f_mod_str)
bar baz {}
>>> assert f_mod_new.counter == 0
>>>

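Is this just a byproduct of the pickle protocol? I've tried variations on __setstate__ to get it to unpickle correctly, but as far as I can tell, it hits the __setitem__ error before __setstate__ is even called. Is there any way I can modify this object to allow unpickling?

3 answers:

Answer 0 (score: 4)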

As the pickle documentation states:

  When a pickled class instance is unpickled, its __init__() method is normally not called.
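As a quick check of the quoted behaviour, here is a minimal sketch. The Probe class and the created list are hypothetical names used only for illustration; the sketch records which of __new__ and __init__ run while a dict subclass is loaded with the default protocol.

```python
import pickle

created = []  # hypothetical log of constructor hooks


class Probe(dict):
    def __new__(cls, *args, **kw):
        created.append('__new__')
        return super().__new__(cls, *args, **kw)

    def __init__(self):
        created.append('__init__')
        super().__init__()


p = Probe()
created.clear()  # only record what happens during loading

q = pickle.loads(pickle.dumps(p))
print(created)  # ['__new__']
```

Only __new__ shows up in the log, so any attribute that is set in __init__ is missing until the unpickler restores the instance state.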

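In your case you do want __init__ to be called. However, since your class is a new-style class, you cannot use __getinitargs__ (which isn't supported in Python 3 anyway). You could try writing custom __getstate__ and __setstate__ methods: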

class Foo(dict):
    def __init__(self):
        self.counter = 0
    def __getstate__(self):
        return (self.counter, dict(self))
    def __setstate__(self, state):
        self.counter, data = state
        self.update(data)  # will *not* call __setitem__

    def __setitem__(self, key, value):
        self.counter += 1
        super(Foo, self).__setitem__(key, value)

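But this still doesn't work: because you are subclassing dict, and dict has a special handler for pickling, the __getstate__ method is called but the __setstate__ method is not.

You can work around this by defining a __reduce__ method: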

class Foo(dict):
    def __init__(self):
        self.counter = 0
    def __getstate__(self):
        return (self.counter, dict(self))
    def __setstate__(self, state):
        self.counter, data = state
        self.update(data)
    def __reduce__(self):
        return (Foo, (), self.__getstate__())

    def __setitem__(self, key, value):
        self.counter += 1
        super(Foo, self).__setitem__(key, value)
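A round trip is an easy way to check this version. The sketch below repeats the class so that it runs on its own; the two keys and the print call are only illustrative.

```python
import pickle


class Foo(dict):
    def __init__(self):
        self.counter = 0

    def __getstate__(self):
        return (self.counter, dict(self))

    def __setstate__(self, state):
        self.counter, data = state
        self.update(data)  # dict.update bypasses the overridden __setitem__

    def __reduce__(self):
        # (callable, args, state): Foo() runs __init__, then
        # __setstate__(state) is called on the new instance.
        return (Foo, (), self.__getstate__())

    def __setitem__(self, key, value):
        self.counter += 1
        super().__setitem__(key, value)


f = Foo()
f['bar'] = 'baz'
f['spam'] = 'eggs'

f_new = pickle.loads(pickle.dumps(f))
print(f_new, f_new.counter)
```

Because __reduce__ returns only three elements, pickle has no separate stream of dictionary items to replay through __setitem__; the keys travel inside the state tuple instead, and the counter is not incremented a second time during loading.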

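Answer 1 (score: 2)

You can add pickle support to your dictionary subclass by adding a __reduce__() method, which is used to get the arguments to pass to a user-defined function that reconstitutes the object when it is unpickled.

Because your class is a dict subclass, implementing this wasn't quite as simple as I originally thought, but it's fairly straightforward once I figured out what needed to be done. Here's what I came up with. Note that the _Foo_unpickle_helper() function can't be a regular, class, or static method of Foo, which is why it's defined at module level: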

class Foo(dict):
    def __init__(self):
        self.counter = 0

    def __setitem__(self, key, value):
        print(key, value, self.__dict__)
        if key == 'bar':
            self.counter += 1
        super(Foo, self).__setitem__(key, value)

    def __reduce__(self):
        return _Foo_unpickle_helper, (self.counter, iter(self.items()))

def _Foo_unpickle_helper(counter, items):
    """ Reconstitute a Foo instance from the arguments. """
    foo = Foo()
    foo.counter = counter
    foo.update(items)  # apparently doesn't call __setitem__()...
    return foo

f = Foo()
f['bar'] = 'baz'
f['bar'] = 'baz'
print('f: {}'.format(f))
print('f.counter: {}'.format(f.counter))

import pickle
f_str = pickle.dumps(f)
print('----------')
f_new = pickle.loads(f_str)
print('f_new: {}'.format(f_new))
print('f_new.counter: {}'.format(f_new.counter))

Output:

bar baz {'counter': 0}
bar baz {'counter': 1}
f: {'bar': 'baz'}
f.counter: 2
----------
f_new: {'bar': 'baz'}
f_new.counter: 2
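The comment on foo.update(items) can be checked directly. The sketch below assumes CPython, and the Loud class and its seen attribute are hypothetical names; it suggests that dict.update writes to the underlying storage without going through an overridden __setitem__.

```python
class Loud(dict):
    """Hypothetical subclass that records every __setitem__ call."""

    def __init__(self):
        self.seen = []
        super().__init__()

    def __setitem__(self, key, value):
        self.seen.append(key)
        super().__setitem__(key, value)


d = Loud()
d.update({'a': 1})  # handled by dict.update, the override is not used
d['b'] = 2          # goes through the override
print(d.seen)       # ['b']
```

That is why the helper can restore the items without touching the counter it has just set.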

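Answer 2 (score: 2)

You are subclassing dict, and the pickle protocol uses the dedicated dict handler to store the keys and values in the resulting pickle data, using a different set of opcodes to restore them to your object.

As a result, __setstate__ is only called after the dictionary keys have been restored, and the state contains only the counter attribute.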
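The ordering can be observed with a small sketch. The Traced class and the calls list are hypothetical names introduced here; the class logs both hooks and otherwise behaves like a plain dict subclass.

```python
import pickle

calls = []  # hypothetical log of the hooks pickle invokes while loading


class Traced(dict):
    def __setitem__(self, key, value):
        calls.append('__setitem__')
        super().__setitem__(key, value)

    def __setstate__(self, state):
        calls.append('__setstate__')
        self.__dict__.update(state)


t = Traced()
t['bar'] = 'baz'
t.counter = 1
calls.clear()  # only record what happens during loading

restored = pickle.loads(pickle.dumps(t))
print(calls)  # ['__setitem__', '__setstate__']
```

The key is written through __setitem__ first, and only then does __setstate__ receive the instance attributes, which matches the empty __dict__ printed in the question.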

There are two workarounds here:

  1. Make your counter code resilient to __init__ not being called:

    class Foo(dict):
        counter = 0
    
        def __setitem__(self, key, value):
            print(key, value, self.__dict__)
            if key == 'bar':
                self.counter += 1
            super(Foo, self).__setitem__(key, value)
    

    Here counter is a class attribute and is therefore always present. You could also use:

    self.counter = getattr(self, 'counter', 0) + 1
    

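    to ensure there is a default value for the missing attribute.

  2. Provide a __getnewargs__ method; it can return an empty tuple, but specifying it ensures that __new__ is called when unpickling, which in turn can call __init__: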

    class Foo(dict):
        def __new__(cls, *args, **kw):
            f = super().__new__(cls, *args, **kw)
            f.__init__()
            return f
    
        def __init__(self):
            self.counter = 0
    
        def __setitem__(self, key, value):
            print(key, value, self.__dict__)
            if key == 'bar':
                self.counter += 1
            super(Foo, self).__setitem__(key, value)
    
        def __getnewargs__(self):
            # Call __new__ (and thus __init__) on unpickling.
            return ()
    

    Note that after __init__ is called, the unpickler still sets all the keys and then restores __dict__, so self.counter reflects the correct value in the end.

Demos:

    First approach:

    >>> import pickle
    >>> class Foo(dict):
    ...     counter = 0
    ...     def __setitem__(self, key, value):
    ...         print(key, value, self.__dict__)
    ...         if key == 'bar':
    ...             self.counter += 1
    ...         super(Foo, self).__setitem__(key, value)
    ... 
    >>> f = Foo()
    >>> f['bar'] = 'baz'
    bar baz {}
    >>> f.counter
    1
    >>> f['bar'] = 'foo'
    bar foo {'counter': 1}
    >>> f.counter
    2
    >>> f_str = pickle.dumps(f)
    >>> new_f = pickle.loads(f_str)
    bar foo {}
    >>> new_f.counter
    2
    >>> new_f.items()
    dict_items([('bar', 'foo')])
    

    Second approach:

    >>> import pickle
    >>> class Foo(dict):
    ...     def __new__(cls, *args, **kw):
    ...         f = super().__new__(cls, *args, **kw)
    ...         f.__init__()
    ...         return f
    ...     def __init__(self):
    ...         self.counter = 0
    ...     def __setitem__(self, key, value):
    ...         print(key, value, self.__dict__)
    ...         if key == 'bar':
    ...             self.counter += 1
    ...         super(Foo, self).__setitem__(key, value)
    ...     def __getnewargs__(self):
    ...         return ()
    ... 
    
    >>> f = Foo()
    >>> f['bar'] = 'baz'
    bar baz {'counter': 0}
    >>> f.counter
    1
    >>> f['bar'] = 'foo'
    bar foo {'counter': 1}
    >>> f.counter
    2
    >>> f_str = pickle.dumps(f)
    >>> new_f = pickle.loads(f_str)
    bar foo {'counter': 0}
    >>> new_f.counter
    2
    >>> new_f.items()
    dict_items([('bar', 'foo')])
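The getattr variant mentioned under the first approach has no demo of its own. A minimal sketch of it follows, keeping the original __init__ and leaving out the print call; it is an illustration under those assumptions rather than a transcript of a real session.

```python
import pickle


class Foo(dict):
    def __init__(self):
        self.counter = 0

    def __setitem__(self, key, value):
        if key == 'bar':
            # counter may not exist yet while unpickling, so default to 0
            self.counter = getattr(self, 'counter', 0) + 1
        super().__setitem__(key, value)


f = Foo()
f['bar'] = 'baz'
f['bar'] = 'foo'

new_f = pickle.loads(pickle.dumps(f))
print(new_f.counter, dict(new_f))  # 2 {'bar': 'foo'}
```

While loading, __setitem__ first sets counter to 1 on the bare instance; the restored __dict__ then overwrites it with the pickled value of 2.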