Python: comparing huge dictionaries

Time: 2015-09-23 11:53:20

Tags: python list dictionary optimization

I have two dictionaries and one list, as follows:

list1 =[['3', {'3': ['4'], '10': ['2'], '9': ['8'], '11': ['8']}],
    ['8', {'7': ['8'], '6': ['9'], '3': ['1']}],
    ['7', {'5': ['11'], '10': ['6'], '2': ['3']}],
    ['9', {'4': ['1']}]
    ]
list2 ={0: -2.829363315837061, 1: -3.483150971596311, 2: -3.55866903680906, 3: -3.644673448138691, 4: -3.78, 5: -3.9343704669124677, 6: -4.1158785480167435, 7: -4.074895059134982, 8: -4.397225116848732, 9: -4.425674125747298, 10: -4.416164011466592, 11: -4.906491662382141}

list3 ={0: -2.865996006819783, 1: -3.6503055799900492, 2: -3.58670223884185, 3: -3.73129019873609, 4: -3.73, 5: -4.049442571308586, 6: -4.086222130931718, 7: -4.19022476024935, 8: -4.243919389901362, 9: -4.246976004644184, 10: -4.334028831306514, 11: -4.678255063114617}

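For each entry in list1, I am trying to look up the keys of its dictionary in the two dictionaries (list2 and list3). Where a key matches, I multiply the value from list2 by the corresponding value from list1 and add up these products for each dictionary in list1. I do the same with list3.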

for index in range(len(list1)):
    tot_pos_probability = 0
    tot_neg_probability = 0
    for the_key, the_value in list1[index][1].items():
        for item in list2.keys():
            if int(the_key) == item:
                tot_pos_probability += int(the_value[0])*list2.get(item)
        for elem in list3.keys():
            if int(the_key) == elem:
                tot_neg_probability += int(the_value[0])*list3.get(elem)

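The code above works fine and gives me the expected results for the sample list and dictionaries shown above.

But my real list1 has about 15,000 entries, and each dictionary in list1 contains roughly 200-400 key-value pairs. The two dictionaries list2 and list3 likewise contain about 10,000 unique key-value pairs each. In that case the code above performs very badly and I cannot get any result: it ran for 10 minutes without finishing. Could you help me with an optimized solution that works well in this case?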

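2 Answers:

Answer 0: (score: 0)

You don't need the inner for loops. Just look the_key up directly in list2 and list3, then compute tot_pos_probability and tot_neg_probability with your formula. You can call .get() with a default value of 0 so that a missing key leaves the totals unchanged. Example: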

for list1elem in list1:
    tot_pos_probability = 0
    tot_neg_probability =0
    for the_key, the_value in list1elem[1].items():
        tot_pos_probability += int(the_value[0])*list2.get(int(the_key), 0)
        tot_neg_probability += int(the_value[0])*list3.get(int(the_key), 0)
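To see why this helps: the question's version scans every key of list2 and list3 for each key in list1, while a dictionary lookup takes roughly constant time on average. Assuming about 300 pairs per list1 entry (the midpoint of the range quoted in the question), that is on the order of 15,000 × 300 × 20,000 ≈ 9×10^10 comparisons versus about 15,000 × 300 × 2 = 9×10^6 lookups. Below is a minimal, self-contained sketch of the same idea wrapped in a function that keeps one pair of totals per list1 entry. The function name `weighted_sums`, its parameter names, and the tiny sample data are only illustrative and are not from the original post.

```python
def weighted_sums(list1, pos_weights, neg_weights):
    """Return one (pos_total, neg_total) pair per entry of list1.

    Hypothetical helper: pos_weights plays the role of list2 and
    neg_weights the role of list3 from the question.
    """
    results = []
    for _label, counts in list1:
        pos_total = 0.0
        neg_total = 0.0
        for key, values in counts.items():
            k = int(key)        # list1 keys are strings, the other dicts use ints
            n = int(values[0])  # each value is a one-element list holding a string
            # .get(k, 0) contributes nothing when the key is missing
            pos_total += n * pos_weights.get(k, 0)
            neg_total += n * neg_weights.get(k, 0)
        results.append((pos_total, neg_total))
    return results


# Made-up sample: the key 99 is absent from both weight dictionaries.
sample = [['9', {'4': ['1']}], ['x', {'99': ['5']}]]
print(weighted_sums(sample, {4: -3.78}, {4: -3.73}))
# prints [(-3.78, -3.73), (0.0, 0.0)]
```

Collecting the totals in a list is a design choice made here for illustration; the snippet above recomputes the two totals on every iteration and leaves it to you what to do with them.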

Answer 1: (score: 0)

With the following approach I saw a considerable performance improvement. It might help you :).

for each in list1:
    # Convert the string keys to ints once and keep them as a reusable set
    mykeys = set(map(int, each[1]))
    common1 = mykeys & set(list2)
    common2 = mykeys & set(list3)
    if common1:
        tot_pos_probability = sum(int(each[1][str(ele)][0]) * list2[ele] for ele in common1)
        print(tot_pos_probability)
    if common2:
        # The negative total takes its weights from list3
        tot_neg_probability = sum(int(each[1][str(ele)][0]) * list3[ele] for ele in common2)
        print(tot_neg_probability)
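The set-intersection idea in this answer can also be sketched for Python 3 without rebuilding `set(list2)` and `set(list3)` for every entry, because dictionary key views support `&` directly. This is only a sketch under assumptions: the function name `sums_via_intersection` and the sample data are made up for illustration, and whether it actually beats the plain `.get()` lookup on your data would need to be measured.

```python
def sums_via_intersection(list1, pos_weights, neg_weights):
    """Return one (pos_total, neg_total) pair per entry of list1.

    Hypothetical helper: pos_weights stands in for list2 and
    neg_weights for list3.
    """
    results = []
    for _label, counts in list1:
        # Convert once: int key -> int count
        as_ints = {int(k): int(v[0]) for k, v in counts.items()}
        # Key views behave like sets, so & yields the shared keys
        pos_keys = as_ints.keys() & pos_weights.keys()
        neg_keys = as_ints.keys() & neg_weights.keys()
        pos_total = sum(as_ints[k] * pos_weights[k] for k in pos_keys)
        neg_total = sum(as_ints[k] * neg_weights[k] for k in neg_keys)
        results.append((pos_total, neg_total))
    return results


# Made-up sample: key 50 has no weight, key 10 has only a positive weight.
sample = [['3', {'3': ['4'], '10': ['2']}], ['9', {'4': ['1'], '50': ['7']}]]
totals = sums_via_intersection(sample, {3: -1.5, 10: -2.0, 4: -3.0}, {3: -0.5, 4: -1.0})
print(totals)
```

Note that the order in which a set is iterated is not fixed, so with floating-point weights the totals may differ in the last digits from a loop that adds the products in dictionary order.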