Question

Say I have a namedtuple like this:

from collections import namedtuple

FooTuple = namedtuple("FooTuple", "item1, item2")

I want the following function to be used for hashing:

def foo_hash(self):
    return hash(self.item1) * hash(self.item2)

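I want this because the order of item1 and item2 should not matter (I am going to do the same for the comparison operators). I thought of two ways to do this. The first is: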

FooTuple.__hash__ = foo_hash

This works, but it feels hacky. So I tried subclassing FooTuple:

class EnhancedFooTuple(FooTuple):
    def __init__(self, item1, item2):
        FooTuple.__init__(self, item1, item2)

    # custom hash function here

But then I get this:

DeprecationWarning: object.__init__() takes no parameters

So, what can I do? Or is this a bad idea altogether, and should I just write my own class from scratch?

Answer 1

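I think there is something wrong with your code (my guess is that you created an instance of the tuple under the same name, so FooTuple is now a tuple instance rather than a tuple class), because subclassing a named tuple like this should work. In any case, you don't need to redefine the constructor. You can just add the hash function: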

In [1]: from collections import namedtuple

In [2]: Foo = namedtuple('Foo', ['item1', 'item2'])

In [3]: class ExtendedFoo(Foo):
   ...:     def __hash__(self):
   ...:         return hash(self.item1) * hash(self.item2)
   ...: 

In [4]: foo = ExtendedFoo(1, 2)

In [5]: hash(foo)
Out[5]: 2
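Since the question also asks for order-independent comparison, here is a minimal sketch of how equality and hashing could be kept consistent in the same subclass. The name UnorderedFoo is made up for illustration, and hashing a frozenset of the fields is just one possible choice, not something the question prescribes. Note that tuple supplies its own __ne__, so it is overridden here as well; otherwise == and != could disagree.

```python
from collections import namedtuple

Foo = namedtuple("Foo", ["item1", "item2"])


class UnorderedFoo(Foo):  # hypothetical name, for illustration only
    __slots__ = ()  # no per-instance __dict__, like a plain namedtuple

    def __eq__(self, other):
        if not isinstance(other, UnorderedFoo):
            return NotImplemented
        # Equal if the fields match in either order.
        return tuple(self) in (
            (other.item1, other.item2),
            (other.item2, other.item1),
        )

    def __ne__(self, other):
        # tuple defines its own __ne__, so keep it in sync with __eq__.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # A frozenset ignores order, so swapped fields hash the same.
        return hash(frozenset(self))


a = UnorderedFoo(1, 2)
b = UnorderedFoo(2, 1)
unique = {a, b}  # both land in the same slot and compare equal
```

With this sketch, a == b holds and the set ends up with a single element. No __init__ is needed: tuples are immutable and are built in __new__, which is also why passing arguments on to __init__ led to the warning in the question.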

Answer 2

Starting with Python 3.6.1, this can be done more cleanly with the typing.NamedTuple class (as long as you are okay with type hints):

from typing import NamedTuple, Any


class FooTuple(NamedTuple):
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)
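A short usage sketch of what this does and does not give you. The class is named Pair here only to keep the example self-contained, and the sample values are arbitrary. The hash becomes order-independent, but equality is still the ordinary ordered tuple comparison, so a set keeps both orderings unless __eq__ (and __ne__) are overridden too.

```python
from typing import Any, NamedTuple


class Pair(NamedTuple):  # illustrative stand-in for FooTuple above
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)


a = Pair("x", "y")
b = Pair("y", "x")

same_hash = hash(a) == hash(b)   # multiplication is commutative
still_different = a != b         # tuple equality is order-sensitive
distinct_count = len({a, b})     # equal hashes, unequal values

# One weakness of a product hash: any field that hashes to 0
# (for example the integer 0) collapses the whole hash to 0.
zero_hashes = (hash(Pair(0, "a")), hash(Pair(0, "b")))
```

The collisions from the zero case are harmless for correctness, but they can slow down dict and set lookups if many such values occur.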

Answer 3

A namedtuple with a custom __hash__ function is useful for storing immutable data models in a dict or set.

For example:

from collections import namedtuple


class Point(namedtuple('Point', ['label', 'lat', 'lng'])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)

    def __str__(self):
        return ", ".join([str(self.lat), str(self.lng)])

Overriding both __eq__ and __hash__ allows businesses to be grouped into a set, ensuring that each line of business is unique within the set:

walgreens = Point(label='Drugstore', lat = 37.78735890, lng = -122.40822700)
mcdonalds = Point(label='Restaurant', lat = 37.78735890, lng = -122.40822700)
pizza_hut = Point(label='Restaurant', lat = 37.78735881, lng = -122.40822713)

businesses = [walgreens, mcdonalds, pizza_hut]
businesses_by_line = set(businesses)

assert len(businesses) == 3
assert len(businesses_by_line) == 2
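One consequence worth spelling out, as a rough sketch: when two points compare equal, a set or dict keeps the first one it saw, so the coordinates of later duplicates are silently dropped. The Point class is repeated here only so the snippet runs on its own, and the visits dict is a made-up example.

```python
from collections import namedtuple


class Point(namedtuple("Point", ["label", "lat", "lng"])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)


mcdonalds = Point(label="Restaurant", lat=37.78735890, lng=-122.40822700)
pizza_hut = Point(label="Restaurant", lat=37.78735881, lng=-122.40822713)

# The set keeps the element that was inserted first.
by_line = set([mcdonalds, pizza_hut])
kept = next(iter(by_line))

# A dict behaves the same way for keys: assigning through an equal key
# updates the value but leaves the original key object in place.
visits = {mcdonalds: 1}   # hypothetical counter, for illustration
visits[pizza_hut] = 2
stored_key = next(iter(visits))
```

If the coordinates of every business matter, grouping into a dict of lists keyed by label may be a better fit than relying on set deduplication.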

Creating a namedtuple with a custom hash function
