How can I group JSON data by key and merge the values in Python?

Time: 2019-01-03 09:50:53

Tags: python json

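I have JSON with the structure shown below. I am trying to group and merge the data by "name". For example, the two sample objects below share the same name, "abc", so they should be merged into a single entry, with "id" and "nestedprop" under that name each holding the contents of both objects.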

[
  {
    "id": [{"random": "12345"}],
    "name": "abc",
    "nestedprop": {
      "malfunc": [
        {
          "Info": {
            "xyz": [
              {
                "vamp": "104531_0_46095",
                "ramp": {
                  "samp": [
                    {
                      "int": 532
                    }
                  ]
                },
                "class_unique_id": "05451"
              }
            ],
            "nati": 39237,
            "apper": 0,
            "supp": {
              "sess": ""
            },
            "session_id": "42461920181213044516299872341"
          }
        },
        {
          "Info": {
            "xyz": [
              {
                "vamp": "104531_0_46095",
                "ramp": {
                  "samp": [
                    {
                      "int": 5325
                    }
                  ]
                },
                "class_unique_id": "05451"
              }
            ],
            "nati": 392537,
            "apper": 0,
            "supp": {
              "sess": ""
            },
            "session_id": "42461920181213044516299872341"
          }
        }
      ]
    }
  },
  {
    "id": [{"asdad": "63653"}],
    "name": "abc",
    "nestedprop": {
      "malfunc": [
        {
          "Info": {
            "xyz": [
              {
                "vamp": "104531_0_46095",
                "ramp": {
                  "samp": [
                    {
                      "int": 532
                    }
                  ]
                },
                "class_unique_id": "05451"
              }
            ],
            "nati": 39237,
            "apper": 0,
            "supp": {
              "sess": ""
            },
            "session_id": "42461920181213044516299872341"
          }
        },
        {
          "Info": {
            "xyz": [
              {
                "vamp": "104531_0_46095",
                "ramp": {
                  "samp": [
                    {
                      "int": 532
                    }
                  ]
                },
                "class_unique_id": "05451"
              }
            ],
            "nati": 39237,
            "apper": 0,
            "supp": {
              "sess": ""
            },
            "session_id": "42461920181213044516299872341"
          }
        }
      ]
    }
  }
]

Expected result:

[{
 "id" : {"0": [{"random" : "12345"}], "1": [{"random" : "12345"}]},
 "name" : "abc",
 "nestedprop" : {"0": [{"content from 1st obj"}], "1": [{"content from 1st obj"}]}
 },
 {
 "id" : {"0": [{"randsdaom" : "123sdas45"}]},
 "name" : "def",
 "nestedprop" : {"0": [{"content from 1st obj"}]}
 }]

Note: I have already implemented this with MapReduce in MongoDB, but I would like to try it in Python.

1 answer:

Answer 0 (score: 1)

There are probably many other ways to do this, but the following approach produces the desired result:

import json

# dictList is the list of dicts parsed from the JSON text,
# e.g. dictList = json.loads(json_text)

# unique names, in order of first appearance
names = list(dict.fromkeys(dic['name'] for dic in dictList))

results = []

for name in names:
    idx = 0
    ids = {}
    props = {}
    for dic in dictList:
        if dic['name'] == name:
            # key each merged value by its position within the group
            ids[str(idx)] = dic['id']
            props[str(idx)] = dic['nestedprop']
            idx += 1
    result_dict = {"name": name,
                   "id": ids,
                   "nestedprop": props}

    results.append(result_dict)

# results = json.dumps(results)  # serialize back to JSON text (double-quoted strings)
results

Output (for a test list containing three "abc" entries and two "def" entries):

[{'name': 'abc',
  'id': {'0': [{'random': '12345'}],
   '1': [{'asdad': '63653'}],
   '2': [{'asdad': '63653'}]},
  'nestedprop': {'0': {'malfunc': [{'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}},
     {'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 5325}]},
         'class_unique_id': '05451'}],
       'nati': 392537,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}}]},
   '1': {'malfunc': [{'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}},
     {'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}}]},
   '2': {'malfunc': [{'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}},
     {'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}}]}}},
 {'name': 'def',
  'id': {'0': [{'asdad': '63653'}], '1': [{'asdad': '63653'}]},
  'nestedprop': {'0': {'malfunc': [{'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}},
     {'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}}]},
   '1': {'malfunc': [{'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}},
     {'Info': {'xyz': [{'vamp': '104531_0_46095',
         'ramp': {'samp': [{'int': 532}]},
         'class_unique_id': '05451'}],
       'nati': 39237,
       'apper': 0,
       'supp': {'sess': ''},
       'session_id': '42461920181213044516299872341'}}]}}}]
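
As a possible alternative to the nested loops above, the grouping can also be done in a single pass over the list. The following is only a minimal sketch, not the answerer's code: the function name `group_by_name` and the trimmed-down sample list are made up for illustration, and it assumes every object has "name", "id" and "nestedprop" keys.

```python
import json


def group_by_name(records):
    """Group dicts by "name"; merge "id" and "nestedprop" under index keys."""
    grouped = {}  # name -> merged entry; dicts keep insertion order (Python 3.7+)
    for rec in records:
        # Create the merged entry the first time a name is seen.
        entry = grouped.setdefault(
            rec["name"], {"name": rec["name"], "id": {}, "nestedprop": {}}
        )
        idx = str(len(entry["id"]))  # next index within this group: "0", "1", ...
        entry["id"][idx] = rec["id"]
        entry["nestedprop"][idx] = rec["nestedprop"]
    return list(grouped.values())


# Hypothetical, trimmed-down input with the same top-level shape as the question's data.
sample = json.loads("""
[
  {"id": [{"random": "12345"}], "name": "abc", "nestedprop": {"malfunc": []}},
  {"id": [{"asdad": "63653"}], "name": "abc", "nestedprop": {"malfunc": []}},
  {"id": [{"randsdaom": "123sdas45"}], "name": "def", "nestedprop": {"malfunc": []}}
]
""")

merged = group_by_name(sample)
print(json.dumps(merged, indent=2))  # serialize back to JSON text
```

Each input object is visited once, so this scales better than rescanning the whole list for every name. If plain arrays are preferred over index-keyed objects, the two inner values could start as empty lists and use `append` instead.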