Counting how many times a dictionary value is found under more than one key

Time: 2013-09-03 00:20:47

Tags: python dictionary

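I am working in Python. Is there a way to count how many times the values in a dictionary are found under more than one key, and then return a count?

So, for example, if I had 50 values and ran a script to do this, I would get a count that looks something like this: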

1: 23  
2: 15  
3: 7  
4: 5  

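The above would tell me that 23 values appear in 1 key, 15 values appear in 2 keys, 7 values appear in 3 keys, and 5 values appear in 4 keys.

Also, would this question change if my dictionary had multiple values per key?

Here is a sample of my dictionary (the values are bacteria names):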

{'0': ['Pyrobaculum'], '1': ['Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium'], '3': ['Thermoanaerobacter', 'Thermoanaerobacter'], '2': ['Helicobacter', 'Mycobacterium'], '5': ['Thermoanaerobacter', 'Thermoanaerobacter'], '4': ['Helicobacter'], '7': ['Syntrophomonas'], '6': ['Gelria'], '9': ['Campylobacter', 'Campylobacter'], '8': ['Syntrophomonas'], '10': ['Desulfitobacterium', 'Mycobacterium']}

So from this sample, which has 8 unique values, the ideal feedback I would get is:

1:4
2:3
3:1

So 4 bacteria names appear in only one key, 3 bacteria names are found in two keys, and 1 bacteria name is found in three keys.

3 Answers:

Answer 0: (score: 5)

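If I understand correctly, you want to count the counts of the dictionary values. If the values can be counted by collections.Counter, you just need to call Counter on the dictionary values and then again on the values of the first counter. Here is an example using a dictionary whose keys are range(100) and whose values are random between 0 and 10:

import random
from collections import Counter

d = dict(enumerate([str(random.randint(0, 10)) for _ in range(100)]))
counter = Counter(d.values())               # value -> number of keys holding it
counts_counter = Counter(counter.values())  # number of keys -> number of values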

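Edit

After the sample dictionary was added to the question, you need to do the first count in a slightly different way (d is the dictionary in the question):

from collections import Counter

c = Counter()
for v in d.values():      # originally d.itervalues(), which exists only on Python 2
    c.update(set(v))      # set() so a name repeated under one key counts once
Counter(c.values())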
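As a rough check of this two-pass count, the sketch below applies it to the sample dictionary from the question. The variable names `keys_per_name` and `names_per_key_count` are illustrative choices, not part of the original answer, and `['Mycobacterium'] * 14` is just shorthand for the 14 repeated entries under key '1'.

```python
from collections import Counter

# Sample dictionary from the question (key '1' holds 14 copies of the same name).
d = {
    '0': ['Pyrobaculum'],
    '1': ['Mycobacterium'] * 14,
    '2': ['Helicobacter', 'Mycobacterium'],
    '3': ['Thermoanaerobacter', 'Thermoanaerobacter'],
    '4': ['Helicobacter'],
    '5': ['Thermoanaerobacter', 'Thermoanaerobacter'],
    '6': ['Gelria'],
    '7': ['Syntrophomonas'],
    '8': ['Syntrophomonas'],
    '9': ['Campylobacter', 'Campylobacter'],
    '10': ['Desulfitobacterium', 'Mycobacterium'],
}

# First pass: number of keys each name appears under.
keys_per_name = Counter()
for names in d.values():
    keys_per_name.update(set(names))  # set() so duplicates within one key count once

# Second pass: how many names share each key count.
names_per_key_count = Counter(keys_per_name.values())

for n_keys in sorted(names_per_key_count):
    print("%d: %d" % (n_keys, names_per_key_count[n_keys]))
# prints:
# 1: 4
# 2: 3
# 3: 1
```

This matches the result the question asks for: four names under a single key, three names under two keys, and one name (Mycobacterium) under three keys.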

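Answer 1: (score: 5)

So, unless I'm reading this wrong, you want to know:

  • For each value in the original dictionary, how many times does each distinct count of values occur?
  • Essentially, what you want is the frequency of the values in the dictionary

I've taken a less elegant approach than the other answer, but have broken the problem down into individual steps: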

d = {'0': ['Pyrobaculum'], '1': ['Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium', 'Mycobacterium'], '3': ['Thermoanaerobacter', 'Thermoanaerobacter'], '2': ['Helicobacter', 'Mycobacterium'], '5': ['Thermoanaerobacter', 'Thermoanaerobacter'], '4': ['Helicobacter'], '7': ['Syntrophomonas'], '6': ['Gelria'], '9': ['Campylobacter', 'Campylobacter'], '8': ['Syntrophomonas'], '10': ['Desulfitobacterium', 'Mycobacterium']}

# Iterate through and find out how many keys each value occurs in
vals = {}                        # A dictionary to store how often each value occurs.
for i in d.values():
    for j in set(i):             # Convert to a set to remove duplicates within one key
        vals[j] = 1 + vals.get(j, 0)  # If we've seen this value, increment its count;
                                      # otherwise we get the default of 0 and increment it
print(vals)

# Iterate through each possible frequency and find how many values have that count.
counts = {}                      # A dictionary to store the final frequencies.
# We will iterate from 0 (which is a valid count) to the maximum count
for i in range(0, max(vals.values()) + 1):
    # Find all values that have the current frequency, count them
    # and add them to the frequency dictionary
    counts[i] = len([x for x in vals.values() if x == i])

for key in sorted(counts.keys()):
    if counts[key] > 0:          # Skip frequencies that no value has
        print("%s : %s" % (key, counts[key]))

You can also test this code on codepad

Answer 2: (score: 2)

You can use Counter:

>>> from collections import Counter
>>> d = dict(((1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3)))
>>> d
{1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
>>> Counter(d.values())
Counter({1: 3, 2: 2, 3: 2})
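The Counter above gives, for each value, the number of keys that hold it. To get the summary the question asks for (how many values are held by 1 key, by 2 keys, and so on), one possible follow-up is a second Counter over those counts. This is a minimal sketch using the same small dictionary; the names `keys_per_value` and `histogram` are illustrative, not from the answer.

```python
from collections import Counter

# Same dictionary as in the session above: one scalar value per key.
d = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

keys_per_value = Counter(d.values())          # value -> number of keys holding it
histogram = Counter(keys_per_value.values())  # number of keys -> how many values

for n_keys, n_values in sorted(histogram.items()):
    print("%d: %d" % (n_keys, n_values))
# prints:
# 2: 2
# 3: 1
```

Here two values (2 and 3) are each held by two keys, and one value (1) is held by three keys. This only covers dictionaries with a single hashable value per key; lists of values need the set-based first pass shown in the other answers.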