Merge two defaultdicts, filling unmatched keys with a special character

Time: 2019-01-30 17:28:28

Tags: python python-3.x pandas collections tuples

I have two defaultdicts, shown below:

L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'),(10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

I want to merge them on their keys and fill in '#' wherever a key is missing from one of them:

RES = [(10956, '#', 'A'),(10955, 'AB', 'A'), (10954, 'AB', 'ABE'), (10953, 'ABC', 'ABC'), (10952, 'ABCD', 'ABCD'),(10951, 'ABCDEF', '#')]

2 answers:

Answer 0: (score: 0)

Just iterate over the sorted keys, and whenever a key is missing from one of the dictionaries, use '#' as the default value.

from collections import OrderedDict
L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'),(10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

L1=OrderedDict(L1)
L2=OrderedDict(L2)

sorted_keys = sorted(set(L1) | set(L2), reverse=True)  # union of both key sets, in descending order (keys() views cannot be added with + on Python 3)

d=OrderedDict() # new orderedDict to keep the results
for i in sorted_keys:
    d[i]=(L1.get(i,'#'),L2.get(i,'#'))

This gives

OrderedDict([(10956, ('#', 'A')),
             (10955, ('AB', 'A')),
             (10954, ('AB', 'ABE')),
             (10953, ('ABC', 'ABC')),
             (10952, ('ABCD', 'ABCD')),
             (10951, ('ABCDEF', '#'))])

To get the final output as a list, change the loop above to

lis=[]
for i in sorted_keys:
    lis.append((i,L1.get(i,'#'),L2.get(i,'#')))

Output

[(10956, '#', 'A'),
 (10955, 'AB', 'A'),
 (10954, 'AB', 'ABE'),
 (10953, 'ABC', 'ABC'),
 (10952, 'ABCD', 'ABCD'),
 (10951, 'ABCDEF', '#')]
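
The same idea can be wrapped in a small reusable function. This is only a sketch: the name `merge_with_fill` and its `fill` parameter are made up for illustration and are not part of the answer above. It builds the key set with a set union, which works on Python 3.

```python
def merge_with_fill(pairs1, pairs2, fill='#'):
    """Merge two lists of (key, value) pairs on key, largest key first.

    Hypothetical helper for illustration; a key missing on one side
    gets `fill` in that position.
    """
    d1, d2 = dict(pairs1), dict(pairs2)
    # Union of the keys from both sides, sorted in descending order
    keys = sorted(set(d1) | set(d2), reverse=True)
    return [(k, d1.get(k, fill), d2.get(k, fill)) for k in keys]


L1 = [(10955, 'AB'), (10954, 'AB'), (10953, 'ABC'), (10952, 'ABCD'), (10951, 'ABCDEF')]
L2 = [(10956, 'A'), (10955, 'A'), (10954, 'ABE'), (10953, 'ABC'), (10952, 'ABCD')]

RES = merge_with_fill(L1, L2)
print(RES)
```

Because it only relies on `dict()` and `.get()`, the inputs can be lists of pairs as in the question, or any mapping's `.items()`.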

Answer 1: (score: 0)

You can use pandas:

import pandas as pd
# L1 and L2 are the lists of (key, value) tuples from the question
d1 = pd.DataFrame.from_dict(dict(L1), orient='index')

d2 = pd.DataFrame.from_dict(dict(L2), orient='index')

pd.concat([d1,d2], axis=1).fillna('#').reset_index().apply(tuple, axis=1).tolist()

Output:

[(10951, 'ABCDEF', '#'), (10952, 'ABCD', 'ABCD'), (10953, 'ABC', 'ABC'), (10954, 'AB', 'ABE'), (10955, 'AB', 'A'), (10956, '#', 'A')]
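
Note that the list shown here is in ascending key order, whereas the RES in the question is in descending order. If the order matters, one option is to sort the resulting list on its first element in reverse. This is a sketch rather than part of the original answer; `pandas_result` and `res` are illustrative names, and the list literal is simply copied from the output above.

```python
# Output of the pandas approach, copied from above (ascending key order)
pandas_result = [
    (10951, 'ABCDEF', '#'), (10952, 'ABCD', 'ABCD'), (10953, 'ABC', 'ABC'),
    (10954, 'AB', 'ABE'), (10955, 'AB', 'A'), (10956, '#', 'A'),
]

# Sort on the key (first tuple element), largest first
res = sorted(pandas_result, key=lambda row: row[0], reverse=True)
print(res)
```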