Question

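I have a document made up of labels like the following:

If I want to count the occurrences of each label and print them, how can I do this in Python?

What I want is: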

201: 3
202: 1
204: 1

Answer 1

Use Counter from the collections module to map each key (a string) to its count:

>>> from collections import Counter
>>> s = '''
201
202
205
201
203
204
201
'''
>>> c = Counter()
>>> for d in s.rstrip().split():
...     c[d] += 1
...
>>> c
Counter({'201': 3, '205': 1, '204': 1, '203': 1, '202': 1})

Or, as Kevin Guan suggested:

>>> c = Counter(s.rstrip().split())
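To get from the Counter to the output format the question asks for, one option is to iterate over the sorted items. This is only a minimal sketch: it assumes the labels are whitespace-separated and that sorting by label is the order wanted, and the helper name `format_counts` is made up for illustration.

```python
from collections import Counter

def format_counts(text):
    """Return 'label: count' strings, sorted by label."""
    # split() with no arguments drops blank lines and surrounding whitespace
    counts = Counter(text.split())
    return ["{}: {}".format(label, n) for label, n in sorted(counts.items())]

s = "201\n202\n205\n201\n203\n204\n201\n"
for line in format_counts(s):
    print(line)
```

Since `split()` already ignores leading and trailing whitespace, the `rstrip()` call is not strictly needed here.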

Edit

I think this can be done even more simply, this way:

>>> l = s.rstrip().split()
>>> l
['201', '202', '205', '201', '203', '204', '201']
>>> c = [l.count(x) for x in l]
>>> c
[3, 1, 1, 3, 1, 1, 3]
>>> d = dict(zip(l, c))
>>> d
{'205': 1, '201': 3, '203': 1, '204': 1, '202': 1}

If you're happy with a one-liner, then:

>>> l = s.rstrip().split()
>>> dict(zip(l, map(l.count, l)))
{'205': 1, '204': 1, '201': 3, '203': 1, '202': 1}
>>> dict(zip(set(l), map(l.count, set(l))))
{'205': 1, '201': 3, '203': 1, '204': 1, '202': 1}
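Note that `l.count` scans the whole list on every call, so these variants do quadratic work as the number of lines grows; for small inputs that is fine. A dict comprehension over `set(l)` expresses the same idea a little more directly. The sketch below uses an illustrative name, `count_with_list_count`, and checks the result against Counter.

```python
from collections import Counter

def count_with_list_count(labels):
    # One list.count() scan per distinct label
    return {label: labels.count(label) for label in set(labels)}

l = "201 202 205 201 203 204 201".split()
d = count_with_list_count(l)
print(d == dict(Counter(l)))  # prints True: same mapping, key order aside
```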

Answer 2

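Try this:

import itertools

with open("your_document") as f:
    # groupby only merges adjacent equal items, so sort the labels first
    labels = sorted(map(int, f.read().split()))
    for label, group in itertools.groupby(labels):
        print("{}: {}".format(label, len(list(group))))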
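`itertools.groupby` only groups consecutive equal items, which is why the values are sorted before grouping. Here is a small in-memory sketch of the same idea, with no file involved; the name `group_counts` is illustrative.

```python
from itertools import groupby

def group_counts(values):
    # Sort first: groupby would otherwise start a new group
    # every time the value changes, even if it was seen before.
    return [(key, sum(1 for _ in group)) for key, group in groupby(sorted(values))]

for label, n in group_counts([201, 202, 205, 201, 203, 204, 201]):
    print("{}: {}".format(label, n))
```

Sorting makes this O(n log n), so for plain counting a dictionary-based approach is usually the simpler choice.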

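If your file is very large, on the order of gigabytes:

import collections

my_dict = collections.defaultdict(int)
with open("your_document") as f:
    for line in f:  # read one line at a time instead of loading the whole file
        my_dict[line.strip()] += 1  # strip the trailing newline from each key

Output: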

>>> my_dict
defaultdict(<type 'int'>, {'201': 2, '203': 1, '202': 1, '205': 1, '204': 1})

Without collections or itertools:

my_dict = {}
with open("your_document") as f:
    for line in f:
        line = line.strip()
        my_dict[line] = my_dict.get(line, 0) + 1
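The loop above works on any iterable of lines, not only a file on disk. The sketch below wraps it in a function and skips blank lines; that is an assumption on my part, since the loop above would count an empty line as its own key. `io.StringIO` stands in for the file, and `count_lines` is a hypothetical name.

```python
import io

def count_lines(lines):
    counts = {}
    for line in lines:
        label = line.strip()
        if not label:  # assumption: blank lines are not labels
            continue
        counts[label] = counts.get(label, 0) + 1
    return counts

fake_file = io.StringIO("201\n202\n\n201\n204\n201\n")
for label, n in sorted(count_lines(fake_file).items()):
    print("{}: {}".format(label, n))
```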

Answer 3

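You can use the readlines() method to get a list of lines, then use Counter from the collections module to get the count of each "label".

>>> from collections import Counter
>>> with open('text.txt') as f:
...     c = Counter(map(str.strip, f.readlines()))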
...     print(c)
... 
Counter({'201': 3, '205': 1, '202': 1, '204': 1, '203': 1})
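`readlines()` builds the full list of lines in memory first. A file object can also be passed to `map` directly, so the lines are read lazily. The sketch below writes a throwaway temporary file so that it runs on its own (the sample data is assumed), and uses `Counter.most_common()` to pick out the most frequent label.

```python
import os
import tempfile
from collections import Counter

# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("201\n202\n205\n201\n203\n204\n201\n")
    path = tmp.name

try:
    with open(path) as f:
        c = Counter(map(str.strip, f))  # iterate the file lazily, line by line
finally:
    os.remove(path)  # clean up the temporary file

top_label, top_count = c.most_common(1)[0]
print("{}: {}".format(top_label, top_count))
```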

How do I count the occurrences of unique labels?

3 answers: