Suppose I have a list of dictionaries (each with the same keys), like this:
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
I only need to merge the Body, Title, and Comments values and return a single dictionary, like this:
{'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': None, 'Comments': 'Dallas. Austin Boston'}
Note that Title is None, so we have to be careful there. This is what I have so far, but it goes wrong somewhere and I can't see where:
keys = set().union(*list_of_dicts)
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: " ".join(str(dic.get(k, '')) for dic in list_of_dicts) for k in keys if k != 'Id'}
merged_dict = {'Id': k_value}
merged_dict.update(d_dict)
However, the code above returns this, which is not what I want:
Final Merged Dict: {'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': 'None None None', 'Comments': 'Dallas. Austin Boston'}
Answer 0 (score: 1)
First, I removed 'Id' from keys so that the dict comprehension no longer has to skip it, and used a plain assignment instead of .update().
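In the argument to join, a value is filtered out when dic[k] is None. If the result of join is an empty string (because every value was None), it is converted to None in the final result.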
keys = set().union(*list_of_dicts)
keys.remove('Id')
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: (" ".join(str(dic[k]) for dic in list_of_dicts if k in dic and dic[k] is not None) or None) for k in keys}
d_dict['Id'] = k_value
print(d_dict)
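The two mechanisms at work here can be checked in isolation. The snippet below is only a small illustration (the variable names are made up for this example): str(None) is the text 'None', which is what the original attempt joined, and an empty string is falsy, which is why `or None` restores None.

```python
values = [None, None, None]

# str() turns None into the text 'None', which is what the original attempt joined.
unfiltered = " ".join(str(v) for v in values)

# Filtering first leaves nothing to join, so the result is the empty string.
filtered = " ".join(str(v) for v in values if v is not None)

# '' is falsy, so `or None` turns an empty result back into None.
title = filtered or None

print(repr(unfiltered), repr(filtered), repr(title))
```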
Answer 1 (score: 1)
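While parsing the list of dictionaries, you can store the intermediate results in defaultdict objects that hold lists of string values. Once all the dictionaries have been parsed, you can join the strings together.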
from collections import defaultdict
dd_body = defaultdict(list)
dd_comments = defaultdict(list)
dd_titles = defaultdict(list)
for row in list_of_dicts:
    dd_body[row['Id']].append(row['Body'])
    dd_comments[row['Id']].append(row['Comments'])
    dd_titles[row['Id']].append(row['Title'] or '')  # Effectively removes `None`.
result = []
for id_ in dd_body:  # All three dictionaries have the same keys.
    body = ' '.join(dd_body[id_]).strip()
    comments = ' '.join(dd_comments[id_]).strip()
    titles = ' '.join(dd_titles[id_]).strip() or None
    result.append({'Id': id_, 'Body': body, 'Title': titles, 'Comments': comments})
>>> result
[{'Id': 4726,
'Body': 'Hello from John Hello from Mary Hello from Dylan',
'Title': None,
'Comments': 'Dallas. Austin Boston'}]
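With more fields, one defaultdict per field gets repetitive. A possible variation, sketched here under the assumption that every value is either a string or None, keeps a single nested defaultdict keyed by Id and then by field name. The function name group_and_join and the extra record with Id 9 are made up for illustration.

```python
from collections import defaultdict


def group_and_join(rows, id_key="Id", sep=" "):
    """Group rows by id_key and join the non-None string values of each field."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for field, value in row.items():
            if field == id_key:
                continue
            # Look up the list even for None so the field still appears in the result.
            bucket = grouped[row[id_key]][field]
            if value is not None:
                bucket.append(value.strip())
    return [
        # An empty join means no non-None value was seen, so report None.
        {id_key: id_, **{f: sep.join(v) or None for f, v in fields.items()}}
        for id_, fields in grouped.items()
    ]


rows = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': 'Austin'},
    {'Id': 9, 'Body': 'Hi', 'Title': 'Note', 'Comments': None},  # made-up extra row
]
result = group_and_join(rows)
print(result)
```

Because the grouping key is the Id, the same call handles one Id or many, and adding a new field to the input needs no change to the code.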
Answer 2 (score: 0)
Less Pythonic than the other answers, but I'd like to think it is easy to follow.
body, title, comments = "", "", ""
list_of_dicts=[
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
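record_id = list_of_dicts[0]['Id']  # renamed from id to avoid shadowing the built-in
for d in list_of_dicts:  # renamed from dict to avoid shadowing the built-in
    # Join with a single space; strip() removes the stray space at either end.
    if d['Body'] is not None:
        body = (body + " " + d['Body']).strip()
    if d['Title'] is not None:
        title = (title + " " + d['Title']).strip()
    if d['Comments'] is not None:
        comments = (comments + " " + d['Comments']).strip()
# A field that is still empty had no non-None values, so report it as None.
if title == "":
    title = None
if body == "":
    body = None
if comments == "":
    comments = None
record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}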
If only the Title field can be None, you can shorten it by dropping the checks on the other fields.
body, title, comments = "", "", ""
list_of_dicts=[
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"}]
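record_id = list_of_dicts[0]['Id']  # renamed from id to avoid shadowing the built-in
for d in list_of_dicts:  # renamed from dict to avoid shadowing the built-in
    # Body and Comments are assumed never to be None here.
    body = (body + " " + d['Body']).strip()
    comments = (comments + " " + d['Comments']).strip()
    if d['Title'] is not None:
        title = (title + " " + d['Title']).strip()
if title == "":
    title = None
record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}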
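The near-identical branches can also be folded into a loop over the field names. This is one possible sketch rather than part of the answer above: the names FIELDS and parts are illustrative, and it assumes every value is a string or None.

```python
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': 'Austin'},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': 'Boston'},
]
FIELDS = ('Body', 'Title', 'Comments')  # fields to merge (assumed fixed)

# Collect the non-None pieces of each field, stripped of surrounding spaces.
parts = {field: [] for field in FIELDS}
for d in list_of_dicts:
    for field in FIELDS:
        if d[field] is not None:
            parts[field].append(d[field].strip())

record = {'Id': list_of_dicts[0]['Id']}
for field in FIELDS:
    # An empty list means every value was None, so keep None.
    record[field] = ' '.join(parts[field]) or None
print(record)
```

Collecting the pieces in lists and joining once at the end avoids having to handle the separator on every concatenation.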
Answer 3 (score: 0)
For this kind of data manipulation, pandas is your friend.
import pandas as pd
# Your list of dictionaries.
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
# Can be read into a pandas dataframe
df = pd.DataFrame(list_of_dicts)
# Do a database style groupby() and apply the function that you want to each group
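# dropna() skips None values, which ' '.join() cannot handle; an all-None column such as Title then joins to ''.
group_transformed_df = df.groupby('Id').agg(lambda x: ' '.join(x.dropna())).reset_index() # I do reset_index to get a normal DataFrame back.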
# DataFrame() -> dict
output_dict = group_transformed_df.to_dict('records')
You can get several kinds of dictionaries out of a DataFrame. Here you want the records option.