I have a list whose elements are dictionaries.
For example:
da_list = [
{'Surface':'APPLE','BaseForm':'apple','PN':0.5},
{'Surface':'BANANA','BaseForm':'banana','PN':0.4},
{'Surface':'ORANGE','BaseForm':'orange','PN':-0.1},
{'Surface':'APPLE','BaseForm':'apple','PN':0.5},
{'Surface':'BANANA','BaseForm':'banana','PN':0.4}
]
I want to build a new list named db_list that stores dicts like this:
db_list = [
{'Surface':'APPLE','BaseForm':'apple','PN':0.5,'Frequency':2},
{'Surface':'BANANA','BaseForm':'banana','PN':0.4,'Frequency':2},
{'Surface':'ORANGE','BaseForm':'orange','PN':-0.1,'Frequency':1}
]
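db_list drops the duplicate elements of da_list and adds to each dictionary a 'Frequency' key holding the number of times it occurred.
How can I do this?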
Answer 0 (score: 3)
You can use itertools.groupby. It only merges adjacent items, so the list is sorted by the same key first:
import itertools
da_list = [{'Surface':'APPLE','BaseForm':'apple','PN':0.5}, {'Surface':'BANANA','BaseForm':'banana','PN':0.4}, {'Surface':'ORANGE','BaseForm':'orange','PN':-0.1}, {'Surface':'APPLE','BaseForm':'apple','PN':0.5}, {'Surface':'BANANA','BaseForm':'banana','PN':0.4}]
new_result = [list(b) for _, b in itertools.groupby(sorted(da_list, key=lambda x:x['Surface']), key=lambda x:x['Surface'])]
final_result = [{**i[0], 'Frequency':len(i)} for i in new_result]
Output:
[{'Surface': 'APPLE', 'BaseForm': 'apple', 'PN': 0.5, 'Frequency': 2}, {'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4, 'Frequency': 2}, {'Surface': 'ORANGE', 'BaseForm': 'orange', 'PN': -0.1, 'Frequency': 1}]
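Note that the snippet above groups on 'Surface' alone, so it assumes that records sharing a Surface are otherwise identical. As a minimal sketch of a stricter variant, assuming two records should count as duplicates only when all three fields match, operator.itemgetter can build a composite key (the names record_key and grouped are illustrative, not from the answer):

```python
import itertools
from operator import itemgetter

da_list = [
    {'Surface': 'APPLE', 'BaseForm': 'apple', 'PN': 0.5},
    {'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4},
    {'Surface': 'ORANGE', 'BaseForm': 'orange', 'PN': -0.1},
    {'Surface': 'APPLE', 'BaseForm': 'apple', 'PN': 0.5},
    {'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4},
]

# Composite key: records are duplicates only if all three fields match.
record_key = itemgetter('Surface', 'BaseForm', 'PN')

# groupby only merges adjacent items, so sort by the same key first.
grouped = itertools.groupby(sorted(da_list, key=record_key), key=record_key)

db_list = []
for _, group in grouped:
    members = list(group)
    # Keep the first record of each group and attach the group size.
    db_list.append({**members[0], 'Frequency': len(members)})

print(db_list)
```

Because of the sort, the result comes back ordered by key rather than by first appearance in da_list.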
Answer 1 (score: 2)
You can also use Counter together with a list comprehension:
>>> from collections import Counter
>>> [dict(k + (('frequency', v),)) for k,v in Counter(tuple(k.items()) for k in da_list).items()]
[{'Surface': 'APPLE', 'BaseForm': 'apple', 'PN': 0.5, 'frequency': 2},
{'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4, 'frequency': 2},
{'Surface': 'ORANGE', 'BaseForm': 'orange', 'PN': -0.1, 'frequency': 1}]
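The Counter one-liner relies on every dict listing its keys in the same order (tuple(k.items()) is order-sensitive) and on all values being hashable. Below is a minimal sketch of an order-insensitive variant, assuming hashable values and using the 'Frequency' key spelled as in the question; the helper name count_duplicates is hypothetical:

```python
from collections import Counter


def count_duplicates(records):
    """Return the unique records, each with a 'Frequency' count added."""
    first_seen = {}   # frozenset of items -> first record with those items
    counts = Counter()
    for record in records:
        # frozenset ignores key order; it requires every value to be hashable.
        key = frozenset(record.items())
        first_seen.setdefault(key, record)
        counts[key] += 1
    # Counter keeps insertion order (Python 3.7+), so results follow first appearance.
    return [{**first_seen[key], 'Frequency': n} for key, n in counts.items()]


da_list = [
    {'Surface': 'APPLE', 'BaseForm': 'apple', 'PN': 0.5},
    {'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4},
    {'Surface': 'ORANGE', 'BaseForm': 'orange', 'PN': -0.1},
    {'BaseForm': 'apple', 'PN': 0.5, 'Surface': 'APPLE'},  # same record, keys reordered
    {'Surface': 'BANANA', 'BaseForm': 'banana', 'PN': 0.4},
]
db_list = count_duplicates(da_list)
print(db_list)
```

Unlike the groupby approach, this does a single pass without sorting, and the input dicts are left unmodified.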