Does Python have an equivalent of this Linux command:
cat file.txt | sort -n | uniq -c
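It sorts a text file that holds one integer per line, counts how often each value occurs, and prints the result in the following form (count first, then value):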
76539 1
100441 2
108637 3
108874 4
103580 5
91869 6
78458 7
61955 8
46100 9
32701 10
21111 11
13577 12
7747 13
4455 14
2309 15
1192 16
554 17
264 18
134 19
63 20
28 21
15 22
12 23
7 24
5 25
If not, can I simply call os.system("cat file.txt | sort -n | uniq -c")?
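On the os.system part of the question: os.system only returns the command's exit status and lets the output go straight to the terminal, so the counts never reach your Python code. A minimal sketch of capturing the pipeline's output instead, assuming a Unix-like system with sort and uniq on the PATH and a file that holds exactly one integer per line. The function name shell_counts is made up for illustration.

```python
import shlex
import subprocess


def shell_counts(path):
    """Run `sort -n PATH | uniq -c` and return a list of (count, value) pairs."""
    # shlex.quote guards against spaces or shell metacharacters in the path.
    command = "sort -n {} | uniq -c".format(shlex.quote(path))
    result = subprocess.run(
        command,
        shell=True,            # needed because the command contains a pipe
        capture_output=True,   # requires Python 3.7+
        text=True,
        check=True,            # raise CalledProcessError on a non-zero exit status
    )
    pairs = []
    for line in result.stdout.splitlines():
        # uniq -c pads the count with leading spaces; split() ignores them.
        count, value = line.split()
        pairs.append((int(count), int(value)))
    return pairs
```

Calling shell_counts('file.txt') would then give you the same numbers as the shell command, but as Python integers. This ties the script to the external tools, which is the main reason to prefer a pure-Python approach.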
Answer 0: (score: 1)
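import collections

c = collections.Counter()
with open('file.txt') as f:
    for text in f:
        # One integer per line; count each occurrence.
        c.update([int(text.strip())])

# Sort the (value, count) pairs by value, like `sort -n`.
c_sorted = sorted(c.most_common())
for key, val in c_sorted:
    # Count first, then value, like `uniq -c`.
    print(val, key)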
Answer 1: (score: 0)
>>> import collections
>>> collections.Counter(['asdf', 'sdfg', 'asdf', 'qwer', 'sdfg', 'asdf'])
Counter({'asdf': 3, 'sdfg': 2, 'qwer': 1})
>>> collections.Counter(map(str.strip, open('file.txt').readlines()))
Counter({'spam': 5, 'hello': 3, 'world': 2, 'eggs': 2})
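Note that the Counter above keys on strings and displays entries by frequency, not by value. A rough sketch of getting from there to the "count value" layout in the question, assuming every non-blank line is an integer. The helper name format_counts and the sample data are made up for illustration.

```python
import collections


def format_counts(lines):
    """Return 'count value' strings ordered by numeric value."""
    # Skip blank lines and convert the rest to int so that 10 sorts after 9.
    counts = collections.Counter(int(line) for line in lines if line.strip())
    # sorted() on (value, count) pairs orders by value first, like `sort -n`.
    return ["{} {}".format(count, value) for value, count in sorted(counts.items())]


# Stand-in for the lines of file.txt.
sample = ["3\n", "1\n", "3\n", "10\n", "2\n", "1\n", "3\n"]
for row in format_counts(sample):
    print(row)
```

With a real file you could pass the open file object directly, since iterating over it yields its lines.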
Answer 2: (score: 0)
You can use itertools.groupby:
from itertools import groupby
words = ['blah', 'blah2']
my_result = dict((key, len(list(word_group))) for key, word_group in groupby(sorted(words)))
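For the integer file in the question, the same idea might look like the sketch below. It assumes one integer per line and sorts numerically before grouping, because groupby only merges adjacent equal items. The name grouped_counts is hypothetical.

```python
from itertools import groupby


def grouped_counts(lines):
    """Count equal values after a numeric sort; returns (value, count) pairs."""
    # Convert to int first so the sort is numeric rather than lexicographic.
    values = sorted(int(line) for line in lines if line.strip())
    # Each group is a run of identical values; count its members without
    # building an intermediate list.
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


for value, count in grouped_counts(["3", "1", "3", "10", "2", "1", "3"]):
    print(count, value)
```

This mirrors the shell pipeline quite closely: the sort corresponds to sort -n and the grouping to uniq -c. It does sort the whole input, so a Counter is likely the cheaper option for large files.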
Answer 3: (score: 0)
https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html
numpy.unique might be worth considering, but its return_counts option is not available in older versions of the library (it was added in NumPy 1.9.0), so whether you can use it depends on the version you have.
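A short sketch of what that could look like, assuming NumPy is installed and recent enough to support return_counts. The sample array stands in for the contents of file.txt.

```python
import numpy as np

# Stand-in data; in practice the values would come from the file.
values = np.array([3, 1, 3, 10, 2, 1, 3])

# np.unique returns the distinct values in sorted order; with
# return_counts=True it also returns how often each one occurs.
unique_values, counts = np.unique(values, return_counts=True)

for value, count in zip(unique_values, counts):
    print(count, value)
```

To read the real file, something like np.loadtxt('file.txt', dtype=int) should produce the input array, provided every line holds a single integer.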