I have a list of dictionaries (key-value pairs) like this:
list = [{'mid': 123, 'msg': 'sometext', 'antivirus': 'positive'},
{'mid': 123, 'msg': 'sometext2', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext3', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext4', 'antivirus': 'positive'},
{'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'}]
I would like the result to be a new list of dictionaries (in the most efficient way, if possible), grouped by the value of the 'mid' key:
result = [{'mid': 123, 'msg': 'sometext,sometext2', 'antivirus': 'positive,positive'},
{'mid': 456, 'msg': 'sometext3,sometext4', 'antivirus': 'positive,positive'},
{'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'}]
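Answer 0 (score: 0)
Not very excited about this approach, but it will get you there. Iterate over lst and use a defaultdict to group the dictionaries by the value of mid, then iterate over that to produce the output, joining the values of the msg and antivirus keys.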
from collections import defaultdict
lst = [{'mid': 123, 'msg': 'sometext', 'antivirus': 'positive'},
{'mid': 123, 'msg': 'sometext2', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext3', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext4', 'antivirus': 'positive'},
{'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'}]
dd = defaultdict(list)
for d in lst:
    key = d['mid']
    dd[key].append(d)
output = []
for (k, v) in dd.items():
    output.append({
        'mid': k,
        'msg': ','.join(x['msg'] for x in v),
        'antivirus': ','.join(x['antivirus'] for x in v),
    })
print(output)
[ {'mid': 123, 'msg': 'sometext,sometext2', 'antivirus': 'positive,positive'}, {'mid': 456, 'msg': 'sometext3,sometext4', 'antivirus': 'positive,positive'}, {'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'} ]
Answer 1 (score: 0)
You can just use a pandas DataFrame:
import pandas as pd
lst = [{'mid': 123, 'msg': 'sometext', 'antivirus': 'positive'},
{'mid': 123, 'msg': 'sometext2', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext3', 'antivirus': 'positive'},
{'mid': 456, 'msg': 'sometext4', 'antivirus': 'positive'},
{'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'}]
d = (pd.DataFrame(lst)
     .groupby(['mid'])
     .agg(','.join)
     .reset_index()
     .to_dict('records'))  # spell out 'records'; the short form 'r' is deprecated
print(d)
Output:
[{'mid': 123, 'antivirus': 'positive,positive', 'msg': 'sometext,sometext2'},
{'mid': 456, 'antivirus': 'positive,positive', 'msg': 'sometext3,sometext4'},
{'mid': 789, 'antivirus': 'positive', 'msg': 'sometext5'}]
Answer 2 (score: 0)
Giving one of your variables the same name as a built-in (list) is a bad idea, so I use l here instead.
Using an intermediate defaultdict:
from collections import defaultdict
intermediate = defaultdict(lambda: defaultdict(list))
for record in l:
    mid = record["mid"]
    for key, value in record.items():
        if key == "mid":
            continue
        intermediate[mid][key].append(value)
result = [
    {"mid": mid, **{key: ",".join(value) for key, value in attributes.items()}}
    for mid, attributes in intermediate.items()
]
result
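The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name group_records and its key and sep parameters are made up for illustration, and it assumes every non-key value can be converted with str().

```python
from collections import defaultdict


def group_records(records, key="mid", sep=","):
    """Merge dicts that share the same `key`, joining the other values with `sep`."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        fields = grouped[record[key]]  # creates the group even if no other fields exist
        for field, value in record.items():
            if field != key:
                fields[field].append(str(value))
    return [
        {key: k, **{field: sep.join(values) for field, values in fields.items()}}
        for k, fields in grouped.items()
    ]


l = [{'mid': 123, 'msg': 'sometext', 'antivirus': 'positive'},
     {'mid': 123, 'msg': 'sometext2', 'antivirus': 'positive'},
     {'mid': 456, 'msg': 'sometext3', 'antivirus': 'positive'},
     {'mid': 456, 'msg': 'sometext4', 'antivirus': 'positive'},
     {'mid': 789, 'msg': 'sometext5', 'antivirus': 'positive'}]
result = group_records(l)
print(result)
```

Because it does not rely on the input order, this works even when records with the same mid are not adjacent in the list.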
Answer 3 (score: 0)
(list is the name of a built-in in Python, so I renamed the variable to mylist.) Here is the obligatory one-liner:
from functools import reduce; import itertools; list(map(lambda sub: reduce(lambda a, b: {key: b[key] if key == 'mid' else ','.join(x for x in (a.get(key, ''), str(b.get(key, ''))) if x != '') for key in set(a) | set(b)}, sub, {}), (list(g) for _, g in itertools.groupby(mylist, lambda d: d['mid']))))
A less obnoxious version:
import itertools
from functools import reduce  # reduce is no longer a built-in in Python 3
# Organize the dicts into groups on key 'mid'. groupby only merges adjacent
# items, so mylist must already be sorted by 'mid'.
groups = [list(g) for _, g in itertools.groupby(mylist, key=lambda d: d['mid'])]
def joindicts(a, b):
    result = dict()
    for key in set(a) | set(b):  # union of the keys of both dicts
        if key == 'mid':  # keep the grouping key as-is instead of joining it
            result[key] = b[key]
            continue
        val_a = str(a.get(key, ''))
        val_b = str(b.get(key, ''))
        result[key] = ','.join(x for x in (val_a, val_b) if x != '')
    return result
[reduce(joindicts, sub, {}) for sub in groups]
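One caveat with itertools.groupby: it only groups consecutive items with equal keys, so the input has to be sorted by 'mid' first if it is not already. A small sketch of the difference, using a made-up unsorted_records list for illustration:

```python
import itertools
from operator import itemgetter

# Hypothetical sample data where the two 'mid' == 123 records are not adjacent.
unsorted_records = [
    {'mid': 123, 'msg': 'a'},
    {'mid': 456, 'msg': 'b'},
    {'mid': 123, 'msg': 'c'},
]

# Without sorting, the two 'mid' == 123 records end up in separate groups.
naive_keys = [k for k, _ in itertools.groupby(unsorted_records, key=itemgetter('mid'))]

# sorted() is stable, so records keep their relative order within each 'mid'.
by_mid = sorted(unsorted_records, key=itemgetter('mid'))
grouped = {k: [d['msg'] for d in g]
           for k, g in itertools.groupby(by_mid, key=itemgetter('mid'))}

print(naive_keys)  # [123, 456, 123]
print(grouped)     # {123: ['a', 'c'], 456: ['b']}
```

The sort costs O(n log n), so for unsorted input a dict-based grouping may be the cheaper option.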