Question

Suppose I have a list of dictionaries (each with the same keys), like this:

list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

I just need to concatenate the Body, Title, and Comments fields and return a single dictionary, like this:

{'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': None, 'Comments': 'Dallas. Austin Boston'}

Note that Title is None, so we have to be careful there. This is what I have done so far, but it fails somewhere and I can't see where:

    keys = set().union(*list_of_dicts)
    print(keys)
    k_value = list_of_dicts[0]['Id']
    d_dict = {k: " ".join(str(dic.get(k, '')) for dic in list_of_dicts) for k in keys if k != 'Id'}

    merged_dict = {'Id': k_value}
    merged_dict.update(d_dict)

However, the code above returns this, which is not what I want:

Final Merged Dict: {'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': 'None None None', 'Comments': 'Dallas. Austin Boston'}

Answer 1

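First, I removed Id from keys so the dictionary comprehension doesn't have to skip it, and I used a plain assignment instead of .update().

In the argument to join, values are filtered out when dic[k] is None. If the join result is an empty string (because every value was None), it is converted to None in the final result.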

keys = set().union(*list_of_dicts)
keys.remove('Id')
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: (" ".join(str(dic[k]) for dic in list_of_dicts if k in dic and dic[k] is not None) or None) for k in keys}
d_dict['Id'] = k_value

print(d_dict)
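
The original attempt printed 'None None None' because str(None) is the string 'None', and dic.get(k, '') only falls back to '' when the key is missing, not when its value is None. As a rough sketch of the same filtering idea wrapped in a reusable function, the version below also strips stray whitespace (so 'Dallas. ' does not leave a double space) and keeps the keys in their original order. The name merge_records and its parameters are made up for illustration, and it assumes every record has the same keys, as in the question.

```python
def merge_records(records, id_key="Id", sep=" "):
    """Merge dicts that share id_key, joining the other values with sep."""
    merged = {id_key: records[0][id_key]}
    # Iterate the first record's keys to keep their original order
    # (assumption: every record has the same keys).
    for key in records[0]:
        if key == id_key:
            continue
        # Skip None explicitly: str(None) would otherwise become 'None'.
        parts = [str(r[key]).strip() for r in records if r.get(key) is not None]
        # An empty join result (all values None or blank) becomes None.
        merged[key] = sep.join(p for p in parts if p) or None
    return merged


list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

print(merge_records(list_of_dicts))
```

Iterating over the first record's keys rather than a set is a deliberate choice here: a set has no guaranteed order, so the merged dictionary's key order would otherwise vary between runs.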


Answer 2

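While iterating over the list of dictionaries, you can store intermediate results in defaultdict objects that hold lists of string values. Once all the dictionaries have been processed, you can join the strings together.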

from collections import defaultdict

dd_body = defaultdict(list)
dd_comments = defaultdict(list)
dd_titles = defaultdict(list)

for row in list_of_dicts:
    dd_body[row['Id']].append(row['Body'])
    dd_comments[row['Id']].append(row['Comments'])
    dd_titles[row['Id']].append(row['Title'] or '')  # Effectively removes `None`.

result = []
for id_ in dd_body:  # All three dictionaries have the same keys.
    body = ' '.join(dd_body[id_]).strip()
    comments = ' '.join(dd_comments[id_]).strip()
    titles = ' '.join(dd_titles[id_]).strip() or None
    result.append({'Id': id_, 'Body': body, 'Title': titles, 'Comments': comments})
>>> result
[{'Id': 4726,
  'Body': 'Hello from John Hello from Mary Hello from Dylan',
  'Title': None,
  'Comments': 'Dallas.  Austin Boston'}]
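
The same grouping idea can be written without one defaultdict per field. The sketch below is only an illustration under a few assumptions: the function name merge_by_id and the sample rows are hypothetical, any field may be None (not just Title), and each value is stripped before joining so trailing spaces do not produce double spaces.

```python
from collections import defaultdict


def merge_by_id(records, id_key="Id", sep=" "):
    """Group records by id_key and join each remaining field's values."""
    # grouped[id][field] -> list of non-None values, in input order
    grouped = defaultdict(lambda: defaultdict(list))
    for row in records:
        fields = grouped[row[id_key]]
        for key, value in row.items():
            if key == id_key:
                continue
            values = fields[key]  # creates the list even if every value is None
            if value is not None:
                values.append(str(value).strip())

    result = []
    for id_, fields in grouped.items():
        merged = {id_key: id_}
        for key, values in fields.items():
            # An empty join result (all values None or blank) becomes None.
            merged[key] = sep.join(v for v in values if v) or None
        result.append(merged)
    return result


# Hypothetical sample data with two different Ids.
rows = [
    {'Id': 1, 'Body': 'a', 'Title': None, 'Comments': 'x '},
    {'Id': 2, 'Body': 'b', 'Title': 'T', 'Comments': None},
    {'Id': 1, 'Body': 'c', 'Title': None, 'Comments': 'y'},
]

print(merge_by_id(rows))
```

Because dictionaries preserve insertion order, the groups come out in the order each Id first appears in the input.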

Answer 3

Less Pythonic than the other answers, but I'd like to think it is easy to understand.

body_parts, title_parts, comment_parts = [], [], []
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# Use names that do not shadow the built-ins id and dict.
record_id = list_of_dicts[0]['Id']

for row in list_of_dicts:
    # Collect only the values that are not None, stripped of stray spaces.
    if row['Body'] is not None:
        body_parts.append(row['Body'].strip())

    if row['Title'] is not None:
        title_parts.append(row['Title'].strip())

    if row['Comments'] is not None:
        comment_parts.append(row['Comments'].strip())

# ' '.join() puts a space between the pieces; an empty result becomes None.
body = ' '.join(body_parts) or None
title = ' '.join(title_parts) or None
comments = ' '.join(comment_parts) or None

record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}

If only the Title field can be None, you can shorten it by dropping the None checks on the other fields.

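body_parts, title_parts, comment_parts = [], [], []
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# Use names that do not shadow the built-ins id and dict.
record_id = list_of_dicts[0]['Id']

for row in list_of_dicts:
    body_parts.append(row['Body'].strip())
    comment_parts.append(row['Comments'].strip())

    # Only Title may be None here, so it is the only field that is checked.
    if row['Title'] is not None:
        title_parts.append(row['Title'].strip())

# ' '.join() puts a space between the pieces; an empty title becomes None.
body = ' '.join(body_parts)
comments = ' '.join(comment_parts)
title = ' '.join(title_parts) or None

record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}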

Answer 4

For this kind of data manipulation, pandas is your friend.

import pandas as pd

# Your list of dictionaries.
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# Can be read into a pandas dataframe
df = pd.DataFrame(list_of_dicts)

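# Do a database-style groupby() and join the strings in each group.
# dropna() skips None values, which ' '.join() cannot handle.
# reset_index() turns 'Id' back into a regular column.
group_transformed_df = df.groupby('Id').agg(lambda x: ' '.join(x.dropna())).reset_index()

# DataFrame -> list of dicts; an empty string (every value was None) becomes None again.
output_dict = [
    {k: (None if v == '' else v) for k, v in row.items()}
    for row in group_transformed_df.to_dict('records')
]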

You can get many kinds of dictionaries out of a DataFrame. Here you want the records option.

Concatenate string values across multiple dictionaries
