How to merge a list of dictionaries with duplicate keys

Time: 2019-09-21 03:57:29

Tags: python arrays pandas list dictionary

I have a list of dictionaries:

[
    {
        'Tahsin': [
            {'January': 1}
        ]
    },
    {
        'Arabic Language': [
            {'September': 1}
        ]
    },
    {
        'Arabic Language': [
            {'August': 2}
        ]
    },
    {
        'Arabic Language': [
            {'August': 2}
        ]
    }
]

I want to merge the values that share the same key and remove duplicates.

I tried the following code:

list_of_unique_dicts = []
for dict_ in student_per_course:
    if dict_ not in list_of_unique_dicts:
        list_of_unique_dicts.append(dict_)

and got this result:

[
    {
        'Tahsin': [
            {'January': 1}
        ]
    },
    {
        'Arabic Language': [
            {'September': 1}
        ]
    },
    {
        'Arabic Language': [
            {'August': 2}
        ]
    }
]

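This is not what I want: the exact duplicate is gone, but the months for 'Arabic Language' are still in separate entries under a repeated key.

Then I tried the following code: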

bar = {
    k: [d.get(k) for d in list_of_unique_dicts]
    for k in set().union(*list_of_unique_dicts)
}

and got the following result:

{
    'Tahsin': [
        [
            {'January': 1}
        ],
        None, None
    ], 
    'Arabic Language': [
        None, 
        [
            {'September': 1}
        ],
        [
            {'August': 2}
        ]
    ]
}

Still not the right result ^_^.

I also tried pandas with the following code:

res = pd.DataFrame(list_of_unique_dicts).to_dict(orient='list')

and got the following result:

{
    'Tahsin': [
        [
            {'January': 1}
        ], 
        nan, nan
    ], 
    'Arabic Language': [
        nan, 
        [
            {'September': 1}
        ],
        [
            {'August': 2}
        ]
    ]
}

That is still not the result I want.

The expected result is:

[
    {
        'Tahsin': [
            {'January': 1}
        ]
    },
    {
        'Arabic Language':
            [
                {'September': 1,
                 'August': 2
                 }
            ]
    },
]

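In other words, the values from the first result should be merged by key.

So, how can I do this? Any help would be appreciated :)

1 answer:

Answer 0 (score: 2)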

The idea is to first flatten the data into a set of tuples, which removes duplicates:

# L is initially the list of dictionaries from the question
L = {(k, k1, v1) for d in L for k, v in d.items() for y in v for k1, v1 in y.items()}
print(L)
{('Arabic Language', 'August', 2), 
 ('Tahsin', 'January', 1), 
 ('Arabic Language', 'September', 1)}
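
If the one-line comprehension is hard to read, the same flattening can be written as explicit nested loops. This is only an illustrative sketch: the function name `flatten` is hypothetical, and the input is assumed to be the `student_per_course` list from the question.

```python
def flatten(records):
    """Return a set of (course, month, count) tuples from a list of dicts."""
    triples = set()
    for d in records:                          # each single-key dict in the list
        for course, entries in d.items():      # course name -> list of month dicts
            for entry in entries:              # each {month: count} dict
                for month, count in entry.items():
                    triples.add((course, month, count))
    return triples


# Sample data copied from the question
student_per_course = [
    {'Tahsin': [{'January': 1}]},
    {'Arabic Language': [{'September': 1}]},
    {'Arabic Language': [{'August': 2}]},
    {'Arabic Language': [{'August': 2}]},
]

triples = flatten(student_per_course)
print(len(triples))  # 3, because the repeated August entry collapses in the set
```

Because the result is a set, the order in which the tuples are printed or iterated is not guaranteed.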

Then convert back to your structure:

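from collections import defaultdict

# Group the (key, month, value) tuples into one dict of months per key
out = defaultdict(dict)
for a, b, c in L:
    out[a][b] = c

# Wrap each merged dict in a list to match the original nesting
out = [{k: [v] for k, v in out.items()}]
print(out)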
[{'Arabic Language': [{'August': 2, 'September': 1}], 'Tahsin': [{'January': 1}]}]
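
Note that the output above is a single dictionary wrapped in a list, while the question asks for one dictionary per key, and iterating over a set does not guarantee any particular order. A minimal sketch of a variant that skips the set, keeps first-seen order (dictionaries preserve insertion order in Python 3.7+), and emits one dictionary per key might look like this. The function name `merge_by_key` is hypothetical, and the sketch assumes that if the same month appears with different values, the later value should win.

```python
def merge_by_key(records):
    """Merge the month dicts of records that share the same top-level key."""
    merged = {}
    for d in records:
        for course, entries in d.items():
            # One dict of months per course; dict keys deduplicate repeated months
            months = merged.setdefault(course, {})
            for entry in entries:
                months.update(entry)
    # One single-key dict per course, matching the layout asked for in the question
    return [{course: [months]} for course, months in merged.items()]


# Sample data copied from the question
student_per_course = [
    {'Tahsin': [{'January': 1}]},
    {'Arabic Language': [{'September': 1}]},
    {'Arabic Language': [{'August': 2}]},
    {'Arabic Language': [{'August': 2}]},
]

result = merge_by_key(student_per_course)
print(result)
# [{'Tahsin': [{'January': 1}]}, {'Arabic Language': [{'September': 1, 'August': 2}]}]
```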