Merge a list of JSON objects on a specific key and combine their values into lists in Python

Time: 2015-08-18 14:51:56

Tags: python json

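I have JSON in the following format. My requirement is to reshape the data: if the "id" field is the same in several entries, the values of their remaining fields should be combined into lists while the keys stay the same. I tried looping over the data and consulting other example code, but I could not get the desired result. I also tried adding values to a new dictionary based on the 'id' field, but the result is either just the last value or something like this: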

[  
    {  
        "time":" all dates ",
        "author_id":"alll ",
        "id_number":"all id_number",
        "id":"all idd"
    }
]

Received JSON:

data = [  
    {  
        "time":"2015/03/27",
        "author_id":"abc_123",
        "id":"4585",
        "id_number":123
    },
    {  
        "time":"2015/03/30",
        "author_id":"abc_123",
        "id":"7776",
        "id_number":122
    },
    {  
        "time":"2015/03/22",
        "author_id":"abc_123",
        "id":"8449",
        "id_number":111
    },
    {  
        "time":"2012/03/30",
        "author_id":"def_456",
        "id":"4585",
        "id_number":90
    }
]

Required output:

new_data = [
    {
        "time":[
            "2015/03/27",
            "2012/03/30"
        ],
        "author_id":[
            "abc_123",
            "def_456"
        ],
        "id":"4585",
        "id_number":[
            123,
            90
        ]
    },
    {
        "time":"2015/03/30",
        "author_id":"abc_123",
        "id":"7776",
        "id_number":122
    },
    {
        "time":"2015/03/22",
        "author_id":"abc_123",
        "id":"8449",
        "id_number":111
    }
]

1 answer:

Answer 0 (score: 0)

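A first step could be to create a more regular structure by mapping each id to a dictionary in which every key maps to a list of the corresponding values, merging the original dictionaries that share the same id value.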

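Then, in a second step, create the result list by going through the merged dictionaries and deciding, based on the length of the value lists, whether to copy the dictionary as is or to take the single element out of each list while copying. That's it.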

#!/usr/bin/env python
# coding: utf8
from __future__ import absolute_import, division, print_function
from collections import defaultdict
from functools import partial
from pprint import pprint


def main():
    records = [
        {
            'time': '2015/03/27',
            'author_id': 'abc_123',
            'id': '4585',
            'id_number': 123
        },
        {
            'time': '2015/03/30',
            'author_id': 'abc_123',
            'id': '7776',
            'id_number': 122
        },
        {
            'time': '2015/03/22',
            'author_id': 'abc_123',
            'id': '8449',
            'id_number': 111
        },
        {
            'time': '2012/03/30',
            'author_id': 'def_456',
            'id': '4585',
            'id_number': 90
        }
    ]

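    # Step 1: map each id to a dict whose keys collect lists of values.
    id2record = defaultdict(partial(defaultdict, list))
    for record in records:
        merged_record = id2record[record['id']]
        # items()/values() work on both Python 2 and 3, unlike iteritems().
        for key, value in record.items():
            merged_record[key].append(value)

    # Step 2: unwrap single-element lists; keep lists for merged ids.
    result = list()
    for record in id2record.values():
        if len(record['id']) == 1:
            result.append(dict((k, vs[0]) for k, vs in record.items()))
        else:
            # The id itself is always a single value, not a list.
            record['id'] = record['id'][0]
            result.append(dict(record))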

    pprint(result)


if __name__ == '__main__':
    main()
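
For reuse, the same two steps can be wrapped in a function that returns the result instead of printing it. This is only a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); the name `merge_by_id` and its `key` parameter are illustrative and not part of the original answer.

```python
from collections import defaultdict


def merge_by_id(records, key='id'):
    """Merge dicts that share the same `key` value (hypothetical helper)."""
    # Step 1: group by key; every field collects its values in a list.
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        merged = grouped[record[key]]
        for field, value in record.items():
            merged[field].append(value)

    # Step 2: unwrap single-element lists, keep lists for merged records.
    result = []
    for merged in grouped.values():
        if len(merged[key]) == 1:
            result.append({field: values[0] for field, values in merged.items()})
        else:
            combined = dict(merged)
            combined[key] = merged[key][0]  # the key itself stays a single value
            result.append(combined)
    return result
```

If the data arrives as JSON text rather than as a Python list, it can be parsed first with `json.loads` and the merged result serialized again with `json.dumps`.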

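If you can change the requirements for the output, I would suggest getting rid of the irregularity in the values. Code that processes the result has to handle both cases, a single value and a list/array of values, which makes it a bit more complicated than it needs to be.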
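
To illustrate that suggestion, here is a sketch of a regular variant in which every field except the id is always a list, even when only one record has that id. It assumes Python 3.7+; `merge_regular` is a hypothetical name, not something from the question.

```python
from collections import defaultdict


def merge_regular(records, key='id'):
    """Group records by `key`; all other fields always become lists."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        merged = grouped[record[key]]
        for field, value in record.items():
            if field != key:
                merged[field].append(value)
    # The key stays a single value; everything else is a list.
    return [{key: key_value, **fields} for key_value, fields in grouped.items()]
```

With this shape a consumer can always iterate, for example `for t in entry['time']`, without first checking whether the value is a list.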

Update: Fixed a problem in the code. The id value should always be a single value, not a list.