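Optimizing code to get the keys of matching values in two dictionaries

Time: 2020-10-29 18:21:18

Tags: python dictionary optimization

I am looking for a way to improve the performance of my code: given two dictionaries, I need to find the keys of matching value pairs. So far I iterate over both dictionaries, which is very slow when each of them holds up to 100000 key-value pairs.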

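Given:

  • The keys of both dictionaries are always numeric and sorted in ascending order
  • The keys of both dictionaries refer to features of QGIS layers that I need to work with, so I really need to keep them that way
  • The values of both dictionaries can have any data type, but both always have the same data type
  • The values of both dictionaries are randomly populated
  • The values can contain duplicates that cannot be removed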

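Does anyone have a brilliant idea for improving the performance? Note that "no, absolutely not possible" is also an acceptable answer if it comes with good reasons, so that I can finally stop trying and searching.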

dict_a = {1:'abc',2:'def',3:'abc',4:'ghj',5:'klm',6:'nop',7:'def',8:'abc',9:'xyz',10:'abc'}
dict_b = {1:'abc',2:'a',3:'b',4:'xyz',5:'abc',6:'b',7:'c',8:'def',9:'d',10:'e'}
# imagine both dictionaries have up to 100000 entries...

# example of my desired output
# (a dict literal keeps only the last value for a repeated key, so this holds one key_b per key_a)
desired_matching_dict = {1:1,1:5,2:8,3:1,3:5,7:8,8:1,8:5,9:4,10:1,10:5}
matching_dict_slow = {}
matching_dict_fast = {}

# This will be very slow when having huge dictionaries...
for key_a, value_a in dict_a.items():
    for key_b, value_b in dict_b.items():
        if value_a == value_b:
            matching_dict_slow[key_a] = key_b

# Seeking an attempt to speed this up
# But getting lost...
for key, value in dict_a.items():
    if value in dict_b.items():
        if dict_a[key] == dict_b[key]:
            matching_dict_fast[key]=dict_a[key]

print('Slow method works: ' + str(desired_matching_dict == matching_dict_slow))
print('Fast method works: ' + str(desired_matching_dict == matching_dict_fast))

2 answers:

Answer 0: (score: 3)

From the competitive-programming use cases I usually come across, this simple approach should work fine:

dict_a = {1:'abc',2:'def',3:'abc',4:'ghj',5:'klm',6:'nop',7:'def',8:'abc',9:'xyz',10:'abc'}
dict_b = {1:'abc',2:'a',3:'b',4:'xyz',5:'abc',6:'b',7:'c',8:'def',9:'d',10:'e'}

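# Invert dict_b: map each value to the list of dict_b keys that hold it
dic2 = {}
for i in dict_b.keys():
    elem = dict_b[i]
    if dic2.get(elem, None):
        dic2[elem].append(i)
    else:
        dic2[elem] = [i]
# For each key of dict_a, look up the dict_b keys that share its value
matches = {}
for i in dict_a.keys():
    elem = dict_a[i]
    x = dic2.get(elem, None)
    if x:
        matches[i] = x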

print(matches) #prints {1: [1, 5], 2: [8], 3: [1, 5], 7: [8], 8: [1, 5], 9: [4], 10: [1, 5]}
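
The same inverted-index idea can be written a little more compactly with `collections.defaultdict`. This is only a sketch: the names `match_keys` and `keys_by_value` are made up for illustration, and it assumes the values are hashable (strings, numbers, tuples), which "any data type" does not guarantee.

```python
from collections import defaultdict

def match_keys(dict_a, dict_b):
    """Map each key of dict_a to the list of dict_b keys holding an equal value."""
    # One pass over dict_b: group its keys by value (values must be hashable).
    keys_by_value = defaultdict(list)
    for key_b, value_b in dict_b.items():
        keys_by_value[value_b].append(key_b)
    # One pass over dict_a: each membership test is an average O(1) hash lookup.
    # Keys of dict_a with equal values share the same list object,
    # so copy a list before mutating it.
    return {
        key_a: keys_by_value[value_a]
        for key_a, value_a in dict_a.items()
        if value_a in keys_by_value
    }

dict_a = {1:'abc',2:'def',3:'abc',4:'ghj',5:'klm',6:'nop',7:'def',8:'abc',9:'xyz',10:'abc'}
dict_b = {1:'abc',2:'a',3:'b',4:'xyz',5:'abc',6:'b',7:'c',8:'def',9:'d',10:'e'}

matches = match_keys(dict_a, dict_b)
print(matches)
```

Both dictionaries are traversed once, so the work grows roughly with len(dict_a) + len(dict_b) instead of their product.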

You can then access the features like this:

for k, v in matches.items():
    # v is the list of dict_b keys whose value equals dict_a[k]
    for key_b in v:
        print('desired pair: ' + 'key (dict_a feature) = ' + str(k) + ' | value(dict_b feature) = ' + str(key_b))
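
Since a dictionary cannot hold the same key twice, the pairs from the question (1:1, 1:5, ...) may be easier to handle as a flat list of tuples. A minimal sketch, assuming `matches` has the shape printed above; the name `pairs` is only illustrative:

```python
# Result of the lookup step, copied from the printed output above.
matches = {1: [1, 5], 2: [8], 3: [1, 5], 7: [8], 8: [1, 5], 9: [4], 10: [1, 5]}

# Flatten {key_a: [key_b, ...]} into (key_a, key_b) tuples, one per matching pair.
pairs = [(key_a, key_b) for key_a, keys_b in matches.items() for key_b in keys_b]

print(pairs)
# [(1, 1), (1, 5), (2, 8), (3, 1), (3, 5), (7, 8), (8, 1), (8, 5), (9, 4), (10, 1), (10, 5)]
```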

Answer 1: (score: 0)

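def dict_gen(a, b):
    # Yield (key of a, list of keys of b with an equal value).
    # Every value of a is still compared with every value of b.
    for i in a:
        res = []
        for j in b:
            if a[i] == b[j]:
                res.append(j)
        if res:
            yield i, res

# dict_a and dict_b as defined in the question
d = dict(dict_gen(dict_a, dict_b))
print(d)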

Output:

{1: [1, 5], 2: [8], 3: [1, 5], 7: [8], 8: [1, 5], 9: [4], 10: [1, 5]}
[Finished in 0.1s]
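
The 0.1s above is for the ten-entry example. The generator still loops over all of b for every key of a, so it is likely to scale like the original nested loop. The sketch below checks, on synthetic data, that a nested-loop version and an inverted-index version return the same result, and prints their timings so they can be compared locally. The helper names, the size `n` and the random integer values are assumptions for illustration, not taken from the thread.

```python
import random
import time

def match_nested(a, b):
    """Quadratic baseline: compare every value of a with every value of b."""
    result = {}
    for i in a:
        res = [j for j in b if a[i] == b[j]]
        if res:
            result[i] = res
    return result

def match_indexed(a, b):
    """Inverted-index variant: group the keys of b by value, then look up."""
    keys_by_value = {}
    for j, value in b.items():
        keys_by_value.setdefault(value, []).append(j)
    return {i: keys_by_value[v] for i, v in a.items() if v in keys_by_value}

random.seed(0)   # fixed seed so runs are repeatable
n = 2000         # kept small so the quadratic baseline finishes quickly
a = {k: random.randrange(500) for k in range(1, n + 1)}
b = {k: random.randrange(500) for k in range(1, n + 1)}

start = time.perf_counter()
slow = match_nested(a, b)
middle = time.perf_counter()
fast = match_indexed(a, b)
end = time.perf_counter()

same = slow == fast
print('same result:', same)
print('nested loops: %.4fs | inverted index: %.4fs' % (middle - start, end - middle))
```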