Merging multiple identical keys into one in a nested dictionary

Time: 2019-09-12 20:03:38

Tags: python dictionary

I have a list of nested dictionaries that looks like this:

      my_list =

      [{'id': '166073',
        'ref': [{'MeSH': 'C548074'},
       {'UMLS': 'C1969084'},
       {'OMIM': '611523'},
       {'ICD-10': 'Q04.3'}]},

       {'id': '213',
       'ref': [{'MeSH': 'D003554'},
       {'UMLS': 'C0010690'},
       {'MedDRA': '10011777'},
       {'ICD-10': 'E72.0'},
       {'OMIM': '219750'},
       {'OMIM': '219800'},
       {'OMIM': '219900'}]},

       {'id': '333',
        'ref': [{'UMLS': 'C2936785'},
       {'ICD-10': 'E75.2'},
       {'MeSH': 'C537075'},
       {'MeSH': 'D055577'},
       {'UMLS': 'C0268255'},
       {'OMIM': '228000'}]}
         .
         .
         .               
                         ]

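I want to merge the dictionaries inside each 'ref' list that share the same key, collecting their values into a list under that key, like this: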

    my_list =

    [{'id': '166073',
      'ref': [{'MeSH': 'C548074'},
       {'UMLS': 'C1969084'},
       {'OMIM': '611523'},
       {'ICD-10': 'Q04.3'}]},

     {'id': '213',
      'ref': [{'MeSH': 'D003554'},
       {'UMLS': 'C0010690'},
       {'MedDRA': '10011777'},
       {'ICD-10': 'E72.0'},
       {'OMIM': ['219750', '219800', '219900']}]},

     {'id': '333',
      'ref': [{'UMLS': 'C2936785'},
       {'ICD-10': 'E75.2'},
       {'MeSH': ['C537075', 'D055577']},
       {'UMLS': 'C0268255'},
       {'OMIM': '228000'}]}
         .
         .
         .               
                         ]

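I tried to merge them by reading the dictionaries with a double for loop and storing the information in a new dictionary, but I found that this approach is not optimal. Is there another recommended way to do this kind of merge? Thanks!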

4 answers:

Answer 0: (score: 3)

Why isn't it optimal? That kind of merge should be fine. I think your merge would look like this:

my_list = [{'id': '166073',
        'ref': [{'MeSH': 'C548074'},
       {'UMLS': 'C1969084'},
       {'OMIM': '611523'},
       {'ICD-10': 'Q04.3'}]},

       {'id': '213',
       'ref': [{'MeSH': 'D003554'},
       {'UMLS': 'C0010690'},
       {'MedDRA': '10011777'},
       {'ICD-10': 'E72.0'},
       {'OMIM': '219750'},
       {'OMIM': '219800'},
       {'OMIM': '219900'}]},

       {'id': '333',
        'ref': [{'UMLS': 'C2936785'},
       {'ICD-10': 'E75.2'},
       {'MeSH': 'C537075'},
       {'MeSH': 'D055577'},
       {'UMLS': 'C0268255'},
       {'OMIM': '228000'}]}]

from collections import defaultdict

def merge(item):
  # Collect every value under its key; 'ref' becomes one dict of lists.
  merged = defaultdict(list)
  for ref in item.get('ref', []):
    for key, val in ref.items():
      merged[key].append(val)
  return {**item, 'ref': dict(merged)}

print(list(map(merge, my_list)))
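
Note that the merge function above returns 'ref' as a single dictionary in which every value is a list, which is not quite the shape shown in the question. The following is a minimal sketch of one way to get closer to that shape: it emits one single-key dictionary per key and unwraps one-element lists. The name merge_to_list and the sample record are hypothetical, and the sketch assumes Python 3.7+ so that dictionaries keep insertion order. It merges every repeated key, whether or not the repeats are adjacent.

```python
from collections import defaultdict


def merge_to_list(item):
    """Group values by key, then emit one single-key dict per key."""
    merged = defaultdict(list)
    for ref in item.get('ref', []):
        for key, val in ref.items():
            merged[key].append(val)
    # Unwrap one-element lists so unique keys keep their plain value.
    refs = [{key: vals[0] if len(vals) == 1 else vals}
            for key, vals in merged.items()]
    return {**item, 'ref': refs}


# Hypothetical sample record, shortened from the question's data.
sample = {'id': '213',
          'ref': [{'MeSH': 'D003554'},
                  {'OMIM': '219750'},
                  {'OMIM': '219800'}]}
print(merge_to_list(sample))
# {'id': '213', 'ref': [{'MeSH': 'D003554'}, {'OMIM': ['219750', '219800']}]}
```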

Answer 1: (score: 1)

#!/usr/bin/env python

o = {'id': '213',
     'ref': [{'MeSH': 'D003554'},
             {'UMLS': 'C0010690'},
             {'MedDRA': '10011777'},
             {'ICD-10': 'E72.0'},
             {'OMIM': '219750'},
             {'OMIM': '219800'},
             {'OMIM': '219900'}]}

n = {'id': o['id'],
     'ref': {key: [] for ref in o['ref'] for key in ref}}

for p in o['ref']:
    for k, v in p.items():
        n['ref'][k].append(v)

n['ref'] = [n['ref']]

print(n)
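
The snippet above handles a single record. As a rough sketch, the same idea can be wrapped in a function and applied to every element of the list. The names merge_refs and records are hypothetical, and dict.setdefault is swapped in for the pre-seeded dictionary so that keys keep the order in which they first appear. The output shape matches the snippet above: 'ref' is a list holding one dictionary of lists.

```python
def merge_refs(item):
    """Collect every value under its key, keeping first-seen key order."""
    grouped = {}
    for ref in item['ref']:
        for key, val in ref.items():
            grouped.setdefault(key, []).append(val)
    # Same shape as the single-record snippet: a list holding one dict.
    return {'id': item['id'], 'ref': [grouped]}


# Hypothetical records, shortened from the question's data.
records = [
    {'id': '166073', 'ref': [{'MeSH': 'C548074'}, {'OMIM': '611523'}]},
    {'id': '213', 'ref': [{'OMIM': '219750'}, {'OMIM': '219800'}]},
]
merged_records = [merge_refs(record) for record in records]
print(merged_records[1])
# {'id': '213', 'ref': [{'OMIM': ['219750', '219800']}]}
```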

Answer 2: (score: 1)

I find it easiest to build a dictionary to collect the values and then unpack it into the desired format:

new_list = []

for item in my_list:
    d = {'id': item['id'], 'ref': {}}
    for r in item['ref']:
        only_key = list(r.keys())[0]
        d['ref'][only_key] = d['ref'].get(only_key, []) + [r[only_key]]
    new_list.append(d)

    new_ref = []
    for k, v in d['ref'].items():
        new_ref.append({k: v if len(v) > 1 else v[0]})
    d['ref'] = new_ref



[{'id': '166073', 'ref': [{'OMIM': '611523'}, {'MeSH': 'C548074'}, {'ICD-10': 'Q04.3'}, {'UMLS': 'C1969084'}]},
 {'id': '213', 'ref': [{'MeSH': 'D003554'}, {'UMLS': 'C0010690'}, {'MedDRA': '10011777'}, {'ICD-10': 'E72.0'}, {'OMIM': ['219750', '219800', '219900']}]},
 {'id': '333', 'ref': [{'ICD-10': 'E75.2'}, {'OMIM': '228000'}, {'MeSH': ['C537075', 'D055577']}, {'UMLS': ['C2936785', 'C0268255']}]}]
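
The output above merges the two UMLS entries of record '333', whereas the expected output in the question keeps them separate and only merges entries that sit next to each other. It is not clear from the question whether that is intentional. If it is, a sketch along these lines, using itertools.groupby to merge only consecutive runs of the same key, would reproduce the question's output. The name merge_adjacent is hypothetical.

```python
from itertools import groupby


def merge_adjacent(item):
    """Merge only runs of consecutive single-key dicts sharing a key."""
    # Flatten the single-key dicts into (key, value) pairs, keeping order.
    pairs = [(key, val) for ref in item['ref'] for key, val in ref.items()]
    refs = []
    for key, run in groupby(pairs, key=lambda pair: pair[0]):
        vals = [val for _, val in run]
        # Keep a plain value when the run has a single element.
        refs.append({key: vals if len(vals) > 1 else vals[0]})
    return {**item, 'ref': refs}


sample = {'id': '333',
          'ref': [{'UMLS': 'C2936785'},
                  {'ICD-10': 'E75.2'},
                  {'MeSH': 'C537075'},
                  {'MeSH': 'D055577'},
                  {'UMLS': 'C0268255'},
                  {'OMIM': '228000'}]}
print(merge_adjacent(sample)['ref'])
# [{'UMLS': 'C2936785'}, {'ICD-10': 'E75.2'}, {'MeSH': ['C537075', 'D055577']},
#  {'UMLS': 'C0268255'}, {'OMIM': '228000'}]
```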

Answer 3: (score: 1)

Using a Python list comprehension:

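from collections import defaultdict

def merge(item):
  merged = defaultdict(list)
  # The comprehension runs only for its side effect of filling `merged`;
  # the list of None values it builds is discarded.
  [[merged[k].append(v) for k, v in ref.items()] for ref in item.get('ref', [])]
  return {**item, 'ref': dict(merged)}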