Question

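For example, I need to count how many times each word appears in a list, with the result ordered not by frequency but by the order in which the words first appear, i.e. insertion order.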

from collections import Counter

words = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']

c = Counter(words)

print(c)

So instead of: {'apples': 3, 'kiwis': 2, 'bananas': 1, 'oranges': 1}

I would rather get: {'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

I don't specifically need Counter; any approach that produces the correct result is fine with me.

Answer 1

You can use the recipe that combines collections.Counter and collections.OrderedDict:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)
print(c)
# OrderedCounter(OrderedDict([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]))
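As a small usage sketch (the variable names in_order and top_two are illustrative and not part of the recipe), the subclass should still behave like a Counter while iterating in first-encountered order:

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)

# Iteration follows first-encounter order, not frequency.
in_order = list(c.items())

# Counter methods such as most_common remain available on the subclass.
top_two = c.most_common(2)

print(in_order)  # [('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]
print(top_two)   # [('apples', 3), ('kiwis', 2)]
```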

Answer 2

On Python 3.6+, dict maintains insertion order.

So you can do:

>>> words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
>>> counter = {}
>>> for w in words:
...     counter[w] = counter.get(w, 0) + 1
...
>>> counter
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

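Unfortunately, the Counter in Python 3.6 and 3.7 does not display the insertion order that it maintains; instead, __repr__ sorts the output from most common to least common.

But you can use the same OrderedDict recipe, simply with the Python 3.6+ dict instead: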

from collections import Counter

class OrderedCounter(Counter, dict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, dict(self))

    def __reduce__(self):
        return self.__class__, (dict(self),)

>>> OrderedCounter(words)
OrderedCounter({'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2})

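Alternatively, since Counter is a subclass of dict, which maintains order in Python 3.6+, you can avoid Counter's __repr__ altogether by either calling .items() on the counter or turning the counter back into a dict: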

>>> c=Counter(words)

Displayed through Counter's __repr__ method, the counter is sorted from most common element to least common:

>>> c
Counter({'apples': 3, 'kiwis': 2, 'oranges': 1, 'bananas': 1})

This presentation is in the order encountered, i.e. insertion order:

>>> c.items()
dict_items([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)])

Or,

>>> dict(c)
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

Answer 3

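In Python 3.6, dictionaries are insertion ordered, but this is an implementation detail.

In Python 3.7+, insertion order is guaranteed and can be relied upon. See "Are dictionaries ordered in Python 3.6+?" for more details.

So, depending on your Python version, you may wish to use Counter as is, without creating the OrderedCounter class described in the documentation. This works because Counter is a subclass of dict, i.e. issubclass(Counter, dict) returns True, and therefore inherits dict's insertion ordering.

String representation

It is worth noting that the string representation of Counter, as defined in its __repr__ method, has not been updated to reflect the change in 3.6 / 3.7, i.e. print(Counter(some_iterable)) still shows items in descending order of count. You can trivially recover the insertion order via list(Counter(some_iterable)).

Here are some examples demonstrating the behaviour: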

x = 'xyyxy'
print(Counter(x))         # Counter({'y': 3, 'x': 2}), i.e. most common first
print(list(Counter(x)))   # ['x', 'y'], i.e. insertion ordered
print(OrderedCounter(x))  # OC(OD([('x', 2), ('y', 3)])), i.e. insertion ordered
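If only a printable form in insertion order is needed, one possible sketch is a small helper built on dict(counter). The name ordered_repr is hypothetical and not part of collections, and the approach assumes Python 3.7+ dict ordering:

```python
from collections import Counter

def ordered_repr(counter):
    """Render a Counter with its items in insertion order.

    Hypothetical helper; relies on dict preserving insertion order (Python 3.7+).
    """
    return '%s(%r)' % (type(counter).__name__, dict(counter))

x = 'xyyxy'
text = ordered_repr(Counter(x))
print(text)  # Counter({'x': 2, 'y': 3})
```

This avoids subclassing at the cost of having to call the helper explicitly wherever the counter is displayed.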

Exceptions

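You should not use a regular Counter if the additional or overridden methods available to OrderedCounter are important to you. Of particular note:

OrderedDict, and consequently OrderedCounter, offers the popitem and move_to_end methods.
Equality tests between OrderedCounter objects are order-sensitive and are implemented as list(oc1.items()) == list(oc2.items()).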

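For example, equality tests yield different results. The False below comes from OrderedDict's order-sensitive comparison; note that from Python 3.10 onwards Counter defines its own __eq__, which comes first in OrderedCounter's method resolution order, so the second line may then evaluate to True: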

Counter('xy') == Counter('yx')                # True
OrderedCounter('xy') == OrderedCounter('yx')  # False
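As a brief illustration of the OrderedDict-specific API mentioned above, the following sketch reorders keys with move_to_end, which a plain Counter does not provide. It reuses the recipe's class without the __repr__ and __reduce__ overrides, purely to keep the example short:

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

oc = OrderedCounter('xyz')         # order: x, y, z

oc.move_to_end('x')                # order: y, z, x
oc.move_to_end('z', last=False)    # order: z, y, x

print(list(oc))                    # ['z', 'y', 'x']
print(hasattr(Counter(), 'move_to_end'))  # False
```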

Answer 4

Explanation in the comments

text_list = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']


# create empty dictionary
freq_dict = {}
 
# loop through text and count words
for word in text_list:
    # set the default value to 0
    freq_dict.setdefault(word, 0)
    # increment the value by 1
    freq_dict[word] += 1
 
print(freq_dict)

{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

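A variant of the same loop, sketched here with collections.defaultdict instead of setdefault (a swapped-in technique, not the answer's own code), lets missing keys start at 0 automatically. It assumes Python 3.7+ so that dict ordering is guaranteed:

```python
from collections import defaultdict

text_list = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']

# defaultdict(int) supplies 0 for any key seen for the first time
freq = defaultdict(int)
for word in text_list:
    freq[word] += 1

# convert back to a plain dict for display; insertion order is kept
result = dict(freq)
print(result)  # {'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}
```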

How to count items but keep the order in which they appear?
