Question

I need a bag/multiset data type in Python. I understand that collections.Counter is often used for this purpose. But the comparison operators don't seem to work:

In [1]: from collections import Counter

In [2]: bag1 = Counter(a=1, b=2, c=3)

In [3]: bag2 = Counter(a=2, b=2)

In [4]: bag1 > bag2
Out[4]: True

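This looks like a bug to me. I expected the less-than and greater-than operators to perform set-like subset and superset comparisons. But if that were the case, bag1 > bag2 would be False, because bag2 contains an extra 'a'. There also don't seem to be any subset/superset methods on Counter objects. So I have two questions: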

What comparison logic is used for Counter objects?
How can I compare Counter objects for subset, superset, proper subset, and proper superset?

Answer 1

On Python 2, the comparison falls back to the default sort order for dictionaries (Counter is a subclass of dict).

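Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. [5] Outcomes other than equality are resolved consistently, but are not otherwise defined. [6]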

On Python 3, the comparison raises a TypeError:

Mappings (dictionaries) compare equal if and only if they have the same (key, value) pairs. Order comparisons ('<', '<=', '>=', '>') raise TypeError.
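
To see the Python 3 behaviour quoted above on plain dictionaries, here is a minimal sketch. The helper name try_less_than is made up for this illustration and is not part of any library; it simply reports whether the ordering comparison is supported.

```python
def try_less_than(left, right):
    """Return left < right, or a message string if the comparison is unsupported.

    Hypothetical helper, written only for this illustration.
    """
    try:
        return left < right
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# Equality depends only on the (key, value) pairs, not on insertion order.
print({'a': 1, 'b': 2} == {'b': 2, 'a': 1})

# Ordering two plain dicts is not supported on Python 3,
# so the helper returns the error message instead of a bool.
print(try_less_than({'a': 1}, {'a': 2}))
```

The example uses plain dicts rather than Counter on purpose: as far as I know, newer Python releases (3.10 and later) added subset/superset-style rich comparisons to Counter itself, so the outcome for Counter objects depends on the interpreter version.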

Answer 2

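This unanswered part of the question is an interesting one:

How can I compare Counter objects for subset, superset, proper subset, and proper superset?

By defining the missing "rich comparison methods". You could also use free functions, which would make the client code more explicit.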

from collections import Counter

class PartiallyOrderedCounter(Counter):

    def __le__(self, other):
        """ Multiset inclusion """
        return all( v <= other[k] for k,v in self.items() )


    def __lt__(self, other):
        """ Multiset strict inclusion """
        return self <= other and self != other


    # TODO : __ge__ and __gt__
    # Beware : multiset inclusion is only a partial order, so they CANNOT
    # be written as the negation of __le__ or __lt__


a = PartiallyOrderedCounter('abc')
b = PartiallyOrderedCounter('ab')
c = PartiallyOrderedCounter('abe')

assert a <= a
assert not a < a    
assert b <= a
assert b < a
assert not a < b    
assert not c <= a
assert not a <= c
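
The free-function alternative mentioned above could look roughly like the sketch below. The function names (is_subbag, is_proper_subbag, is_superbag, is_proper_superbag) are my own invention, and the code assumes all counts are non-negative, as they would be for an ordinary bag.

```python
from collections import Counter


def is_subbag(small, big):
    """True if every element occurs in `big` at least as often as in `small`."""
    # Counter returns 0 for missing keys, so big[key] never raises KeyError.
    return all(count <= big[key] for key, count in small.items())


def is_proper_subbag(small, big):
    """True if `small` is contained in `big` and the two bags differ."""
    return is_subbag(small, big) and not is_subbag(big, small)


def is_superbag(big, small):
    """True if `big` contains every element of `small` at least as often."""
    return is_subbag(small, big)


def is_proper_superbag(big, small):
    """True if `big` contains `small` and the two bags differ."""
    return is_proper_subbag(small, big)


# The two bags from the question are incomparable:
# bag2 has an extra 'a', while bag1 has 'c' items that bag2 lacks.
bag1 = Counter(a=1, b=2, c=3)
bag2 = Counter(a=2, b=2)
assert not is_superbag(bag1, bag2)
assert not is_subbag(bag1, bag2)

# 'ab' is a proper sub-bag of 'abc'.
assert is_proper_subbag(Counter('ab'), Counter('abc'))
assert is_proper_superbag(Counter('abc'), Counter('ab'))
```

Because the superset checks are defined by swapping the arguments rather than by negating the subset checks, incomparable bags such as bag1 and bag2 give False in both directions.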
