Python: counting the number of keys in a dictionary

Time: 2013-09-16 10:43:53

Tags: python dictionary

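I built a nested dictionary from my file to group events into classes. I want to use the number of keys to count how many classes I have and how many final values there are. Here is my code so far: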

infile = open('ALL','r')

def round_down(num):
    return num - (num%100)

count = 0
a = []
split_region = {}
lengths = []
for region in infile:
    #print region

    (cov,chrm,pos,end,leng) = region.split()
    start = int(pos)#-1#-int(leng) ## loosen conditions about break points
    end = int(end)
    lengths = int(leng)
    coverage=int(cov)
    rounded_start=round_down(start)
    rounded_length=round_down(lengths)
    if not (chrm in split_region):
        split_region[chrm]={}
    if not (rounded_start in split_region[chrm]):
        split_region[chrm][rounded_start]={}
    if not (rounded_length in split_region[chrm][rounded_start]):
        split_region[chrm][rounded_start][rounded_length]= []
    split_region[chrm][rounded_start][rounded_length].append({'start':start,'length':lengths,'cov':coverage})

    for k,v in split_region[chrm][rounded_start].items():
        print len(v),k,v
        a.append(len(v))
        count +=1
print count
print sum(a)

The file is formatted as follows:

5732    chrM    1   16572   16571
804 chr6    58773612    58780166    6554
722 chr1    142535435   142538993   3558
448 chrY    13447747    13451695    3948
372 chr9    68422753    68423813    1060
327 chr2    133017433   133018716   1283
302 chr18   107858  109884  2026
256 chr20   29638813    29641416    2603
206 chr6    57423087    57429121    6034
204 chr1    142537237   142538991   1754

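So it basically works by rounding the numbers down to the nearest 100 and creating a class in my dictionary for each rounded value. It is nested because I first group by the rounded start and then by the rounded length.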
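To illustrate the binning described above, here is a minimal sketch that computes the class key for the first chr6 line of the sample data. The `step` parameter and the tuple-shaped `key` are assumptions added for illustration; the code in the question hard-codes 100 and uses nested dictionaries instead of a tuple.

```python
def round_down(num, step=100):
    """Round num down to the nearest multiple of step (step is a hypothetical parameter)."""
    return num - (num % step)

# Sample line: 804 chr6 58773612 58780166 6554
# The class is identified by chromosome, rounded start, and rounded length.
key = ("chr6", round_down(58773612), round_down(6554))
print(key)  # ('chr6', 58773600, 6500)
```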

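At the end of the code I try to count how many classes there are, as well as the total number of values. But the output is wrong: the input file has more lines than there are classes. Any ideas how to fix this?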

1 answer:

Answer 0: (score: 0)

It's not clear to me which totals you want, but perhaps you are looking for one of the following:

rounded_start_count = 0
rounded_length_count = 0
rounded_length_value_count = 0

for k1, v1 in split_region.items():
    print k1 + ": " + str(len(v1))
    rounded_start_count += len(v1)
    for k2, v2 in v1.items():
        rounded_length_count += len(v2)
        # v2 maps rounded length -> list of events, so sum the list lengths
        rounded_length_value_count += sum(len(v3) for v3 in v2.values())

print ""

print "chrm count:                 ", len(split_region.keys())
print "Rounded start count:        ", rounded_start_count
print "Rounded length count:       ", rounded_length_count
print "Rounded length value count: ", rounded_length_value_count

Place this after your for loop. For your sample data it prints the following output:

chr6: 2
chr2: 1
chr1: 2
chr9: 1
chrY: 1
chr20: 1
chrM: 1
chr18: 1

chrm count:                  8
Rounded start count:         10
Rounded length count:        10
Rounded length value count:  10
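One likely reason the numbers in the question look off is that, as the code is indented there, the counting loop runs inside the per-line loop, so buckets are re-counted on every input line. The following is a minimal self-contained sketch of counting once, after all lines have been read. It is written in Python 3 syntax (the thread's code is Python 2), and the names `SAMPLE`, `build_groups`, and `count_groups` are hypothetical helpers, not part of the original code. It uses `collections.defaultdict` in place of the explicit `if not ... in ...` checks.

```python
from collections import defaultdict

# The ten sample lines from the question: coverage, chromosome, start, end, length.
SAMPLE = """\
5732 chrM 1 16572 16571
804 chr6 58773612 58780166 6554
722 chr1 142535435 142538993 3558
448 chrY 13447747 13451695 3948
372 chr9 68422753 68423813 1060
327 chr2 133017433 133018716 1283
302 chr18 107858 109884 2026
256 chr20 29638813 29641416 2603
206 chr6 57423087 57429121 6034
204 chr1 142537237 142538991 1754
"""

def round_down(num, step=100):
    return num - (num % step)

def build_groups(lines):
    # chrm -> rounded start -> rounded length -> list of events
    groups = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        cov, chrm, pos, end, leng = line.split()
        start, length = int(pos), int(leng)
        groups[chrm][round_down(start)][round_down(length)].append(
            {'start': start, 'length': length, 'cov': int(cov)})
    return groups

def count_groups(groups):
    class_count = 0  # distinct (chrm, rounded start, rounded length) buckets
    value_count = 0  # events stored across all buckets
    for by_start in groups.values():
        for by_length in by_start.values():
            class_count += len(by_length)
            value_count += sum(len(events) for events in by_length.values())
    return class_count, value_count

groups = build_groups(SAMPLE.splitlines())
class_count, value_count = count_groups(groups)
print("classes:", class_count)
print("values:", value_count)
```

Because every input line is appended to exactly one bucket, `value_count` should equal the number of non-blank input lines, while `class_count` can be at most that number. For the sample data each line lands in its own bucket, so both come out as 10.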