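I have JSON in the format shown below. My requirement is to merge the data when the "id" field is the same: the remaining fields should then be collected into lists, with the keys kept unchanged. I tried looping over the data and adapting other sample code, but I could not get the required result. I also tried adding values to a new dictionary keyed on the 'id' field, but the result contains only the last value, or something like this: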
[
    {
        "time":" all dates ",
        "author_id":"alll ",
        "id_number":"all id_number",
        "id":"all idd"
    }
]
Input JSON:
data = [
    {
        "time":"2015/03/27",
        "author_id":"abc_123",
        "id":"4585",
        "id_number":123
    },
    {
        "time":"2015/03/30",
        "author_id":"abc_123",
        "id":"7776",
        "id_number":122
    },
    {
        "time":"2015/03/22",
        "author_id":"abc_123",
        "id":"8449",
        "id_number":111
    },
    {
        "time":"2012/03/30",
        "author_id":"def_456",
        "id":"4585",
        "id_number":90
    }
]
Required output:
new_data = [
    {
        "time":[
            "2015/03/27",
            "2012/03/30"
        ],
        "author_id":[
            "abc_123",
            "def_456"
        ],
        "id":"4585",
        "id_number":[
            123,
            90
        ]
    },
    {
        "time":"2015/03/30",
        "author_id":"abc_123",
        "id":"7776",
        "id_number":122
    },
    {
        "time":"2015/03/22",
        "author_id":"abc_123",
        "id":"8449",
        "id_number":111
    }
]
Answer 0 (score: 0)
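A first step can be to create a more regular structure by mapping each id to a dictionary in which every key maps to a list of the corresponding values, merging the original dictionaries that share the same id value.
Then, in a second step, build the result list from the merged dictionaries: depending on the length of the value lists, either copy the dictionary as is (keeping the lists) or take the single element out of each list while copying. That's it.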
#!/usr/bin/env python
# coding: utf8
from __future__ import absolute_import, division, print_function
from collections import defaultdict
from functools import partial
from pprint import pprint


def main():
    records = [
        {
            'time': '2015/03/27',
            'author_id': 'abc_123',
            'id': '4585',
            'id_number': 123
        },
        {
            'time': '2015/03/30',
            'author_id': 'abc_123',
            'id': '7776',
            'id_number': 122
        },
        {
            'time': '2015/03/22',
            'author_id': 'abc_123',
            'id': '8449',
            'id_number': 111
        },
        {
            'time': '2012/03/30',
            'author_id': 'def_456',
            'id': '4585',
            'id_number': 90
        }
    ]
    # Step 1: map each id to a dict whose keys all map to lists of values.
    id2record = defaultdict(partial(defaultdict, list))
    for record in records:
        merged_record = id2record[record['id']]
        # items() works on both Python 2 and 3 (iteritems() is Python 2 only).
        for key, value in record.items():
            merged_record[key].append(value)

    # Step 2: unwrap single-element lists; for merged records keep the
    # lists but reduce 'id' to its single value.
    result = list()
    for record in id2record.values():
        if len(record['id']) == 1:
            result.append(dict((k, vs[0]) for k, vs in record.items()))
        else:
            record['id'] = record['id'][0]
            result.append(dict(record))
    pprint(result)


if __name__ == '__main__':
    main()
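If you can change the requirements for the output, I would suggest getting rid of the irregularity in the values. Code that processes the result has to handle both cases, a single value and a list/array of values, which makes it a bit more complicated than it has to be.
Update: Fixed a problem in the code. The id value should always be a single value, not a list.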
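As a rough illustration of the regular output suggested above, here is a minimal sketch in which every field except 'id' is always a list, even for ids that occur only once. The function name merge_by_id and the use of OrderedDict to keep ids in first-seen order are assumptions made for this example; they are not part of the original code.

```python
from collections import OrderedDict, defaultdict


def merge_by_id(records):
    """Merge records sharing an 'id'; every other field becomes a list."""
    merged = OrderedDict()  # keeps ids in the order they first appear
    for record in records:
        target = merged.setdefault(record['id'], defaultdict(list))
        for key, value in record.items():
            if key != 'id':
                target[key].append(value)
    # Turn each group into a plain dict and put the single id value back in.
    return [dict(fields, id=record_id) for record_id, fields in merged.items()]


records = [
    {'time': '2015/03/27', 'author_id': 'abc_123', 'id': '4585', 'id_number': 123},
    {'time': '2015/03/30', 'author_id': 'abc_123', 'id': '7776', 'id_number': 122},
    {'time': '2012/03/30', 'author_id': 'def_456', 'id': '4585', 'id_number': 90},
]
for row in merge_by_id(records):
    print(row)
```

With this shape, consuming code can always iterate over the values of a field without first checking whether it received a single value or a list.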