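Checking the keys and values of a nested dictionary in Python?

Asked: 2016-07-12 05:12:15

Tags: python dictionary

I am generating a nested dictionary in my program. Once it is built, I want to iterate over it and check its keys and values.

Program code

This is the dictionary I want to iterate over; each of its values is itself a dictionary.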

 main_dict = {101: {1234: [11111,11111],5678: [44444,44444]},
              102: {9100: [55555,55555],1112: [77777,88888]}}

I am reading a CSV file and storing its contents in this dictionary, like this:

Input.csv -

 lineno,item,total
 101,1234,11111
 101,1234,11111
 101,5678,44444
 101,5678,44444
 102,9100,55555
 102,9100,55555
 102,1112,77777
 102,1112,88888

This is the input CSV file. I am reading it, and I want to know how many times each distinct total is repeated for an item.

To do that, I am doing this:

for line in reader:
    if line[0] in main_dict:
        if line[1] in main_dict[line[0]]:
            main_dict[line[0]][line[1]].append(line[2])
        else:
            main_dict[line[0]].update({line[1]:[line[2]]})
    else:
        main_dict[line[0]] = {line[1]:[line[2]]}

print main_dict

Output of the program above:

 {101: {1234: [11111,11111],5678: [44444,44444]},
  102: {9100: [55555,55555],1112: [77777,88888]}}

But I get the following error on this line -

 if line[1] in main_dict[line[0]]:
 IndexError: list index out of range

Iterating over main_dict -

for key,value in main_dict.iteritems():
    f1 = open(outputfile + op_directory +'/'+ key+'.csv', 'w')
    writer1 = csv.DictWriter(f1, delimiter=',', fieldnames = fieldname)
    writer1.writeheader()
    if type(value) == type({}):
        for k,v in value.iteritems():
            if type(v) == type([]):
                set1 = set(v)
                for se in set1:
                    writer1.writerow({'item':k,'total':se,'total_count':v.count(se)})

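What is the best way to iterate over this kind of dictionary?

Sometimes I get the correct result, like the dictionary above, but often I run into this error. What am I missing?

Thanks in advance!

1 Answer:

Answer 0: (score: 0)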

As the comments point out, you do not check that line has exactly 3 fields:

for line in reader:
    if not len(line) == 3:
        continue
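
To illustrate why the length check matters, here is a minimal sketch. It assumes Python 3 and uses made-up sample data (SAMPLE) and a hypothetical helper name (read_valid_rows), neither of which is in the original post. csv.reader yields an empty list for a blank line, which is one likely way to hit the IndexError on line[1]:

```python
import csv
import io

# Hypothetical sample data: a header, a valid row, a blank line,
# a row with a missing field, and another valid row.
SAMPLE = "lineno,item,total\n101,1234,11111\n\n101,5678\n102,9100,55555\n"

def read_valid_rows(csvfile):
    """Return only the rows that have exactly three fields."""
    rows = []
    for line in csv.reader(csvfile):
        # A blank line comes back as [], so line[1] would raise IndexError.
        if len(line) != 3:
            continue
        rows.append(line)
    return rows

rows = read_valid_rows(io.StringIO(SAMPLE))
print(rows)
```

The header row still passes this check because it has three fields, which is why the longer code further down also filters on lineno.isdigit().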

As for your algorithm, I would use nested defaultdicts to avoid the if / else lines.
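
As a rough sketch of that idea, the three-branch if / else from the question could collapse into a single append. The rows list here is hypothetical sample data standing in for the CSV reader, and the fields stay as strings, as csv.reader would return them:

```python
from collections import defaultdict

# Hypothetical rows, already split into (lineno, item, total) strings.
rows = [
    ("101", "1234", "11111"),
    ("101", "1234", "11111"),
    ("101", "5678", "44444"),
    ("102", "1112", "77777"),
    ("102", "1112", "88888"),
]

# A missing outer key creates an inner defaultdict;
# a missing inner key creates an empty list.
main_dict = defaultdict(lambda: defaultdict(list))
for lineno, item, total in rows:
    main_dict[lineno][item].append(total)

# Convert the inner defaultdicts to plain dicts for a cleaner display.
plain = {lineno: dict(items) for lineno, items in main_dict.items()}
print(plain)
```

One thing to keep in mind with this design: merely reading a missing key from a defaultdict inserts it, so membership tests should use `in` rather than indexing.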

Edit: after the question was edited, I added a new defaultdict and the CSV-writing part:

from collections import defaultdict
import csv

counter = defaultdict(lambda: defaultdict(list))
main_dict= defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
fieldnames=['item', 'total', 'total_count']

# we suppose reader is a csv.reader object
with open('input.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for line in reader:
        if not len(line) == 3:
            continue
        # Remove unwanted spaces
        lineno, item, total = [el.strip() for el in line]
        # Do not deal with non digit entries (title for example)
        if not lineno.isdigit():
            continue
        counter[lineno][item].append(total)
        csvdict = {'item': item,
                   'total': total,
                   'total_count': counter[lineno][item].count(total)}
        main_dict[lineno][item][total].update(csvdict)

# The writing part
for lineno in sorted(main_dict):
    itemdict = main_dict[lineno]
    output = 'output_%s.csv' % lineno
    with open(output, 'wb') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=',')
        writer.writeheader()
        for totaldict in itemdict.values():
            for csvdict in totaldict.values():
                writer.writerow(csvdict)

You can then print a readable representation of the result with the following function:

def myprint(obj, ntab=0):
    if isinstance(obj, (dict, defaultdict)):
        for k in sorted(obj):
            myprint('%s%s'%(ntab*' ', k), ntab+1)
            myprint(obj[k], ntab+1)
    else:
        print('%s%s'%(ntab*' ', obj))
myprint(main_dict)

But if you want to count the totals, I would use another defaultdict, with total as the key and a list of (lineno, item) tuples as the value:

from collections import defaultdict
import csv

total_dict = defaultdict(list)

# we suppose reader is a csv.reader object
with open('input.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for line in reader:
        if not len(line) == 3:
            continue
        # Remove unwanted spaces
        lineno, item, total = [el.strip() for el in line]
        # Do not deal with non digit entries (title for example)
        if not lineno.isdigit():
            continue
        total_dict[total].append((lineno, item))

You can then get the count for each total very easily:

>>> print len(total_dict['55555'])
2
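
If what you need is the repeat count per (lineno, item, total) combination, as in the total_count column from the question, collections.Counter may be a simpler fit. This is only a sketch of an alternative, not part of the code above; the rows list is hypothetical sample data taken from the question's input file:

```python
from collections import Counter

# Hypothetical rows, already validated and split into strings.
rows = [
    ("101", "1234", "11111"),
    ("101", "1234", "11111"),
    ("102", "9100", "55555"),
    ("102", "9100", "55555"),
    ("102", "1112", "77777"),
    ("102", "1112", "88888"),
]

# Each key is a (lineno, item, total) tuple; each value is how often it occurs.
counts = Counter(rows)

for (lineno, item, total), n in sorted(counts.items()):
    print(lineno, item, total, n)
```

Unlike total_dict, this keeps identical totals that belong to different items separate.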