import gzip

gzip_files = ["complete-credit-ctrl-txn-SE06_2013-07-17-00.log.gz", "complete-credit-ctrl-txn-SE06_2013-07-17-01.log.gz"]

def input_func():
    # Python 2: input() evaluates what is typed, so entering 5 returns the int 5
    num = input("Enter the number of MIN series digits: ")
    return num

for i in gzip_files:
    f = gzip.open(i, 'rb')
    file_content = f.read()
    digit = input_func()
    file_content = file_content.split('[')
    series = []  # list of MIN
    for line in file_content:
        MIN = line.split('|')[13:15]
        for x in MIN:
            n = digit
            x = x[:n]  # keep only the first n digits
            series.append(x)
            break
    # count the number of occurrences in the list named series
    for i in series:
        print i
    # end count
Result:
63928
63928
63929
63929
63928
63928
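This is only part of the result; the actual output is a very long list. Now I want to list each unique number together with how many times it appears in the list. So: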
63928 = 4,
63929 = 2
Answer 0 (score: 4)
I would use the collections.Counter class here.
>>> a = [1, 1, 1, 2, 3, 4, 4, 5]
>>> from collections import Counter
>>> Counter(a)
Counter({1: 3, 4: 2, 2: 1, 3: 1, 5: 1})
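Just pass the series variable to Counter and you will get a dictionary whose keys are the unique elements and whose values are the number of times each one occurs in the list.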
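As a rough sketch of what that looks like for the question's data, the snippet below counts a short hand-written `series` list (the six values shown in the question, assumed here to be strings since they come from slicing text) and prints the counts in the `value = count` form the asker wanted. The variable names `counts` and `prefix` are illustrative only.

```python
from collections import Counter

# Assumed sample: the six values shown in the question; the real list is much longer.
series = ["63928", "63928", "63929", "63929", "63928", "63928"]

counts = Counter(series)

# most_common() yields (element, count) pairs, highest count first.
for prefix, count in counts.most_common():
    print("%s = %d" % (prefix, count))
```

Looking up an element that never occurred, for example `counts["00000"]`, returns 0 rather than raising a KeyError.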
collections.Counter was introduced in Python 2.7. For versions below 2.7, use the following list comprehension:
>>> [(elem, a.count(elem)) for elem in set(a)]
[(1, 3), (2, 1), (3, 1), (4, 2), (5, 1)]
You can then convert it into a dictionary for easier access.
>>> dict((elem, a.count(elem)) for elem in set(a))
{1: 3, 2: 1, 3: 1, 4: 2, 5: 1}
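One caveat with the `a.count(elem)` approach is that it rescans the whole list once per unique element, which may get slow for a very long list like the one in the question. A possible single-pass alternative, sketched here with `collections.defaultdict` (available since Python 2.5), is:

```python
from collections import defaultdict

a = [1, 1, 1, 2, 3, 4, 4, 5]

# defaultdict(int) creates missing keys with the value 0 on first access.
counts = defaultdict(int)
for elem in a:
    counts[elem] += 1

print(dict(counts))
```

This walks the list exactly once, so the cost grows with the length of the list rather than with length times the number of unique values.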
Answer 1 (score: 1)
You can use Counter().
So this will print what you need:
from collections import Counter
c = Counter(series)
for item,count in c.items():
    print "%s = %s" % (item,count)
Answer 2 (score: 0)
Build a dictionary with the unique numbers as keys and their total number of occurrences as values:
d = {}  # instantiate dictionary
for s in series:
    # set default key and value if key does not exist in dictionary
    d.setdefault(s, 0)
    # increment by 1 for every occurrence of s
    d[s] += 1
If the problem were more complex, a map/reduce (a.k.a. map/fold) implementation might be appropriate.
Map Reduce: https://en.wikipedia.org/wiki/MapReduce
Python map function:
http://docs.python.org/2/library/functions.html#map
Python reduce function:
http://docs.python.org/2/library/functions.html#reduce
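For illustration only, here is one way the same count could be phrased in map/reduce style. It is a sketch, not a recommendation over the plain loop: the sample `series` list and the helper name `merge` are assumptions made for this example. The map step turns each value into a `(value, 1)` pair, and the reduce step folds those pairs into a dictionary of totals.

```python
from functools import reduce  # a builtin on Python 2; also in functools since 2.6

# Assumed sample: the six values shown in the question.
series = ["63928", "63928", "63929", "63929", "63928", "63928"]

# Map step: pair every value with a count of 1.
pairs = map(lambda value: (value, 1), series)

def merge(totals, pair):
    # Hypothetical helper: add one pair's count to the running totals.
    key, n = pair
    totals[key] = totals.get(key, 0) + n
    return totals

# Reduce step: fold all pairs into a single dictionary, starting from {}.
counts = reduce(merge, pairs, {})
print(counts)
```

For a list that fits in memory this does the same work as the loop above; the map/reduce shape mostly pays off when the map and reduce steps can be spread across many inputs or machines.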