Question

I have two defaultdicts, shown below:

L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'),(10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

I want to merge the two defaultdicts and fill in any non-matching keys with '#':

RES = [(10956, '#', 'A'),(10955, 'AB', 'A'), (10954, 'AB', 'ABE'), (10953, 'ABC', 'ABC'), (10952, 'ABCD', 'ABCD'),(10951, 'ABCDEF', '#')]

Answer 1

Just iterate over the sorted keys, and use '#' as the default value whenever a key is missing from one of the two dictionaries:

from collections import OrderedDict
L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'),(10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

L1=OrderedDict(L1)
L2=OrderedDict(L2)

sorted_keys = sorted(set(L1) | set(L2), reverse=True)  # union of both key sets, in descending order (works on Python 2 and 3)

d=OrderedDict() # new orderedDict to keep the results
for i in sorted_keys:
    d[i]=(L1.get(i,'#'),L2.get(i,'#'))

This gives:

OrderedDict([(10956, ('#', 'A')),
             (10955, ('AB', 'A')),
             (10954, ('AB', 'ABE')),
             (10953, ('ABC', 'ABC')),
             (10952, ('ABCD', 'ABCD')),
             (10951, ('ABCDEF', '#'))])

To get the final output as a list, change the loop above to:

lis=[]
for i in sorted_keys:
    lis.append((i,L1.get(i,'#'),L2.get(i,'#')))

Output:

[(10956, '#', 'A'),
 (10955, 'AB', 'A'),
 (10954, 'AB', 'ABE'),
 (10953, 'ABC', 'ABC'),
 (10952, 'ABCD', 'ABCD'),
 (10951, 'ABCDEF', '#')]
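
The same approach can be wrapped in a small helper. This is only a sketch for Python 3: the function name `merge_with_fill` and its `fill` parameter are made up for illustration, and plain dicts are used in place of `OrderedDict` because the keys are sorted explicitly anyway.

```python
def merge_with_fill(pairs_a, pairs_b, fill="#"):
    """Outer-join two lists of (key, value) pairs on key, largest key first.

    Hypothetical helper: keys missing from one side get `fill` in that slot.
    """
    a, b = dict(pairs_a), dict(pairs_b)
    # Union of both key sets, sorted in descending order
    keys = sorted(set(a) | set(b), reverse=True)
    return [(k, a.get(k, fill), b.get(k, fill)) for k in keys]


L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'), (10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

RES = merge_with_fill(L1, L2)
print(RES)
```

Passing a different `fill` value, for example `merge_with_fill(L1, L2, fill=None)`, changes the placeholder without touching the rest of the logic.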

Answer 2

You can use pandas:

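import pandas as pd

# Build one single-column DataFrame per list, indexed by key
d1 = pd.DataFrame.from_dict(dict(L1), orient='index')
d2 = pd.DataFrame.from_dict(dict(L2), orient='index')

# Outer-join on the index, fill the gaps with '#', and turn each row into a tuple
pd.concat([d1, d2], axis=1).fillna('#').reset_index().apply(tuple, axis=1).tolist()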

Output:

[(10951, 'ABCDEF', '#'), (10952, 'ABCD', 'ABCD'), (10953, 'ABC', 'ABC'), (10954, 'AB', 'ABE'), (10955, 'AB', 'A'), (10956, '#', 'A')]
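
Note that the output above is in ascending key order, whereas the RES in the question is in descending order. One way to match it, sketched here on a copy of that output (the variable names `rows` and `res` are placeholders, not part of the original answer), is to sort the resulting list on its first element:

```python
# Result list as printed by the pandas answer (ascending key order)
rows = [(10951, 'ABCDEF', '#'), (10952, 'ABCD', 'ABCD'), (10953, 'ABC', 'ABC'),
        (10954, 'AB', 'ABE'), (10955, 'AB', 'A'), (10956, '#', 'A')]

# Re-sort on the key (first tuple element), largest first, to match RES
res = sorted(rows, key=lambda row: row[0], reverse=True)
print(res)
```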

Merge two defaultdicts with a special character for non-matching keys
