Question

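Given an instance of a custom, new-style Python class, what is a good way to hash it and get a unique, ID-like value from it to use for various purposes? Think of an md5sum or sha1sum of a given class instance.

The approach I currently use is to pickle the instance, run the result through hexdigest, and store the resulting hash string in a property on the instance (this property is never part of the pickle/unpickle procedure, FYI). Now, however, I have run into a case where a third-party module uses nested classes, and there is no really good way to pickle those without some hacks. I figure I am missing some clever little Python trick to accomplish this.

Edit

Example code, since it seems to be a requirement around here for a question to get any traction. The following class can be initialized, and its self._uniq_id attribute is set correctly.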

#!/usr/bin/env python

import hashlib

# Prefer cPickle (Python 2) and fall back to pickle.
try:
    import cPickle as pickle
except ImportError:
    import pickle
# END try

# Single class, pickles fine.
class FooBar(object):
    __slots__ = ("_foo", "_bar", "_uniq_id")

    def __init__(self, eth=None, ts=None, pkt=None):
        self._foo = "bar"
        self._bar = "bar"
        self._uniq_id = hashlib.sha1(pickle.dumps(self, -1)).hexdigest()[0:16]

    def __getstate__(self):
        return {'foo':self._foo, 'bar':self._bar}

    def __setstate__(self, state):
        self._foo = state['foo']
        self._bar = state['bar']
        self._uniq_id = hashlib.sha1(pickle.dumps(self, -1)).hexdigest()[0:16]

    def _get_foo(self): return self._foo
    def _get_bar(self): return self._bar
    def _get_uniq_id(self): return self._uniq_id

    foo = property(_get_foo)
    bar = property(_get_bar)
    uniq_id = property(_get_uniq_id)
# End

This next class, however, cannot be initialized, because Bar is nested inside Foo:

#!/usr/bin/env python

import hashlib

# Prefer cPickle (Python 2) and fall back to pickle.
try:
    import cPickle as pickle
except ImportError:
    import pickle
# END try

# Nested class, can't pickle for hexdigest.
class Foo(object):
    __slots__ = ("_foo", "_bar", "_uniq_id")

    class Bar(object):
        pass

    def __init__(self, eth=None, ts=None, pkt=None):
        self._foo = "bar"
        self._bar = self.Bar()
        self._uniq_id = hashlib.sha1(pickle.dumps(self, -1)).hexdigest()[0:16]

    def __getstate__(self):
        return {'foo':self._foo, 'bar':self._bar}

    def __setstate__(self, state):
        self._foo = state['foo']
        self._bar = state['bar']
        self._uniq_id = hashlib.sha1(pickle.dumps(self, -1)).hexdigest()[0:16]

    def _get_foo(self): return self._foo
    def _get_bar(self): return self._bar
    def _get_uniq_id(self): return self._uniq_id

    foo = property(_get_foo)
    bar = property(_get_bar)
    uniq_id = property(_get_uniq_id)
# End

The error I receive is:

Traceback (most recent call last):
  File "./nest_test.py", line 70, in <module>
    foobar2 = Foo()
  File "./nest_test.py", line 49, in __init__
    self._uniq_id = hashlib.sha1(pickle.dumps(self, -1)).hexdigest()[0:16]
cPickle.PicklingError: Can't pickle <class '__main__.Bar'>: attribute lookup __main__.Bar failed

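(nest_test.py contains both classes, hence the line number offset.)

I found out that pickling requires the __getstate__() method, so I also implemented __setstate__() for completeness. But given the existing warnings about pickle and security, there has got to be a better way to do this.

Based on what I have read so far, the error stems from Python not being able to resolve the nested class. It tries to look up the attribute __main__.Bar, which does not exist. It really needs to find __main__.Foo.Bar instead, but there is no really good way to make it do that. I bumped into another SO answer here that provides a "hack" to trick Python, but it came with a stern warning that the approach is not advisable, and that one should either use something other than pickling or move the nested class definition out to module level.

However, I believe the original question behind that SO answer was about pickling to a file and unpickling from it. I only need to pickle in order to use the requisite hashlib functions, which seem to operate on a byte array (as I am used to in .NET), and pickling (especially cPickle) is fast and optimized compared with writing my own byte-array routine.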
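Since hashlib only needs bytes, one way to sidestep pickle entirely is to hash a deterministic text rendering of the instance state. The following is a minimal sketch, not a drop-in replacement: the helper names _state and _digest are made up for illustration, and it assumes every hashed field has a stable repr (the default object repr embeds a memory address and would not qualify).

```python
import hashlib


class FooBar(object):
    __slots__ = ("_foo", "_bar", "_uniq_id")

    def __init__(self, foo="bar", bar="bar"):
        self._foo = foo
        self._bar = bar
        self._uniq_id = self._digest()

    def _state(self):
        # Hypothetical helper: the fields that define the object's content.
        # _uniq_id is deliberately left out so the digest never hashes itself.
        return {"foo": self._foo, "bar": self._bar}

    def _digest(self):
        # Sort the items so the byte string does not depend on dict ordering.
        # Assumption: every value has a deterministic repr.
        payload = repr(sorted(self._state().items())).encode("utf-8")
        return hashlib.sha1(payload).hexdigest()[:16]

    @property
    def uniq_id(self):
        return self._uniq_id


a = FooBar()
b = FooBar()
c = FooBar(foo="baz")
print(a.uniq_id == b.uniq_id)  # equal state, so the digests match
print(a.uniq_id == c.uniq_id)  # different state, so the digests differ
```

Because nothing is pickled, class lookup by name never happens, so where a field's class is defined no longer matters; what matters is only that the field renders to the same text every time.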

Answer 1

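It depends entirely on what properties the ID should have.

For example, you can use id(foo) to get an ID that is guaranteed to be unique for as long as foo is alive in memory, or you can use repr(instance.__dict__) if all the fields have sensible repr values.

What specific properties do you need?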
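To make the difference between those two options concrete, here is a small sketch. Point is a hypothetical class used only for illustration; it has no __slots__, because classes that define __slots__ (like the ones in the question) normally have no __dict__ to take the repr of. Sorting the items is an extra precaution of mine so the text does not depend on attribute insertion order.

```python
class Point(object):
    # Hypothetical example class without __slots__, so instances have __dict__.
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
q = Point(1, 2)

# Identity-based: distinct while both objects are alive, but the value may be
# reused after an object is garbage collected and is not stable across runs.
identity_differs = id(p) != id(q)

# Content-based: equal whenever the attribute values (and their repr) match.
content_matches = repr(sorted(vars(p).items())) == repr(sorted(vars(q).items()))

print(identity_differs, content_matches)
```

So id() answers "is this the same object?", while the repr-based text answers "does this object hold the same data?"; which one is appropriate depends on what the ID is for.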

Answer 2

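Although you are currently taking the hexdigest of a pickle, you make it sound as if the ID does not actually need to be related to the object; it just needs to be unique. Why not simply use the uuid module, specifically uuid.uuid4, to generate unique IDs and assign them to a uuid field on the object...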
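A minimal sketch of that suggestion, reusing the question's Foo with its nested Bar. The attribute name _uuid and the choice to expose the 32-character hex form are my assumptions; the answer itself only names uuid.uuid4.

```python
import uuid


class Foo(object):
    __slots__ = ("_foo", "_bar", "_uuid")

    class Bar(object):
        pass

    def __init__(self):
        self._foo = "bar"
        self._bar = self.Bar()  # the nested class is harmless: nothing is pickled
        # Random (version 4) UUID; it is unrelated to the object's contents.
        self._uuid = uuid.uuid4()

    @property
    def uniq_id(self):
        # 32 hexadecimal characters, without dashes.
        return self._uuid.hex


first = Foo()
second = Foo()
print(first.uniq_id != second.uniq_id)
```

Note the trade-off: two instances with identical contents get different IDs, so this fits only if the ID has to be unique rather than derived from the data.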
