Question

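I am looking for a way to improve the performance of my code: given two dictionaries, I need to find the keys of matching value pairs. So far I am iterating over both dictionaries, which becomes very slow when both have up to 100,000 key-value pairs.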

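Given:

Keys are always numeric and sorted in ascending order
Keys refer to features of the QGIS layers I need to work with, so I really need to keep them as they are
Values can be of any data type, but both dictionaries always use the same data type
Values are randomly filled
Values can contain duplicates that cannot be removed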

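Does anyone have a brilliant idea for improving the performance? Note that "no, this is definitely not possible" is also an acceptable answer if there is a good reason for it, so that I can finally stop trying and searching.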

dict_a = {1:'abc',2:'def',3:'abc',4:'ghj',5:'klm',6:'nop',7:'def',8:'abc',9:'xyz',10:'abc'}
dict_b = {1:'abc',2:'a',3:'b',4:'xyz',5:'abc',6:'b',7:'c',8:'def',9:'d',10:'e'}
# imagine both dictionaries have up to 100000 entries...

desired_matching_dict = {1:1,1:5,2:8,3:1,3:5,7:8,8:1,8:5,9:4,10:1,10:5} # example of my desired output
matching_dict_slow = {}
matching_dict_fast = {}

# This will be very slow when having huge dictionaries...
for key_a, value_a in dict_a.items():
    for key_b, value_b in dict_b.items():
        if value_a == value_b:
            matching_dict_slow[key_a] = key_b

# Seeking an attempt to speed this up
# But getting lost...
for key, value in dict_a.items():
    if value in dict_b.items():
        if dict_a[key] == dict_b[key]:
            matching_dict_fast[key]=dict_a[key]

print('Slow method works: ' + str(desired_matching_dict == matching_dict_slow))
print('Fast method works: ' + str(desired_matching_dict == matching_dict_fast))

Answer 1

Judging from the competitive-programming problems I usually run into, this simple approach should work fine:

dict_a = {1:'abc',2:'def',3:'abc',4:'ghj',5:'klm',6:'nop',7:'def',8:'abc',9:'xyz',10:'abc'}
dict_b = {1:'abc',2:'a',3:'b',4:'xyz',5:'abc',6:'b',7:'c',8:'def',9:'d',10:'e'}

# Build a reverse index of dict_b: value -> list of keys holding that value
dic2 = {}
for i in dict_b.keys():
    elem = dict_b[i]
    if elem in dic2:
        dic2[elem].append(i)
    else:
        dic2[elem] = [i]

# Look up each value of dict_a in the index (one lookup per key)
matches = {}
for i in dict_a.keys():
    elem = dict_a[i]
    x = dic2.get(elem, None)
    if x:
        matches[i] = x

print(matches)  # prints {1: [1, 5], 2: [8], 3: [1, 5], 7: [8], 8: [1, 5], 9: [4], 10: [1, 5]}

You can then access the features as follows:

for k, v in matches.items():
    # v is the list of dict_b keys whose value matches dict_a[k]
    for key_b in v:
        print('desired pair: ' + 'key (dict_a feature) = ' + str(k) + ' | value(dict_b feature) = ' + str(key_b))
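
The same idea can be sketched with collections.defaultdict, returning a flat list of (key_a, key_b) pairs, since a plain dict like the question's desired_matching_dict cannot hold the same key twice. The function name match_keys is hypothetical, and the sketch assumes the values are hashable (strings, numbers, tuples); unhashable values such as lists would first need converting, for example to tuples.

```python
from collections import defaultdict


def match_keys(dict_a, dict_b):
    """Return (key_a, key_b) pairs whose values are equal (hypothetical helper)."""
    # Invert dict_b once: value -> list of keys holding that value.
    index_b = defaultdict(list)
    for key_b, value_b in dict_b.items():
        index_b[value_b].append(key_b)

    # One lookup per entry of dict_a instead of a full scan of dict_b.
    pairs = []
    for key_a, value_a in dict_a.items():
        for key_b in index_b.get(value_a, ()):
            pairs.append((key_a, key_b))
    return pairs


dict_a = {1: 'abc', 2: 'def', 3: 'abc', 4: 'ghj'}
dict_b = {1: 'abc', 2: 'a', 3: 'def', 4: 'abc'}
print(match_keys(dict_a, dict_b))
# [(1, 1), (1, 4), (2, 3), (3, 1), (3, 4)]
```

Building the index takes one pass over dict_b and the matching takes one pass over dict_a, plus the time needed to emit the pairs themselves.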

Answer 2

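def dict_gen(a, b):
    # For each key of a, collect the keys of b that hold the same value.
    # Every value of a is still compared with every value of b.
    for i in a:
        res = []
        for j in b:
            if a[i] == b[j]:
                res.append(j)
        if res:
            yield i, res

# dict_a and dict_b are the dictionaries from the question
d = dict(dict_gen(dict_a, dict_b))
print(d)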

Output:

{1: [1, 5], 2: [8], 3: [1, 5], 7: [8], 8: [1, 5], 9: [4], 10: [1, 5]}
[Finished in 0.1s]
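
To see how nested loops compare with a value index on larger inputs, a rough timing sketch like the one below can be used. The input size, the random value range, and the helper names nested_matches and indexed_matches are assumptions chosen for illustration; actual timings depend on the machine and the data, so none are quoted here.

```python
import random
import time


def nested_matches(a, b):
    # Compares every value of a with every value of b.
    result = {}
    for i in a:
        res = [j for j in b if a[i] == b[j]]
        if res:
            result[i] = res
    return result


def indexed_matches(a, b):
    # Builds value -> keys for b once, then does one lookup per key of a.
    index = {}
    for j, value in b.items():
        index.setdefault(value, []).append(j)
    return {i: index[value] for i, value in a.items() if value in index}


random.seed(0)
n = 2000  # assumed size; the question mentions up to 100000 entries
a = {k: random.randint(0, 500) for k in range(1, n + 1)}
b = {k: random.randint(0, 500) for k in range(1, n + 1)}

for func in (nested_matches, indexed_matches):
    start = time.perf_counter()
    result = func(a, b)
    elapsed = time.perf_counter() - start
    print(f'{func.__name__}: {elapsed:.4f}s, {len(result)} matched keys')
```

Both helpers return the same mapping, so the comparison measures only the cost of the lookup strategy.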

Optimizing code to get the keys of matching values in two dictionaries
