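I'm a Python beginner struggling with the following problem:
I'm trying to merge several lists of nested dictionaries that I decoded from multiple JSON files. The common thread between the lists is the "uid" key of each nested dict, which corresponds to a name, but the problem is that some dicts use a different key name. For example, instead of "uid", a dict may have "number" as the key. I'd like to merge them into one super list of nested dictionaries. To illustrate, what I have is: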
listA = [{"uid": "12345", "name": "John Smith"}, {etc...}]
listB = [{"number": "12345", "person": "John Smith", "val1": "25"}, {etc...}]
listC = [{"number": "12345", "person": "John Smith", "val2": "65"}, {etc...}]
What I'd like to end up with is:
masterlist = [{"uid": "12345", "name": "John Smith", "val1": "25", "val2": "65"}, {etc...}]
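Is it possible to do this efficiently and Pythonically by iterating through the lists and comparing on the same "uid" value? I've seen plenty of ways to merge on matching keys, but the problem here is obviously that the keys are inconsistent. Ordering doesn't matter. All I need is for the master list to contain the corresponding uid, name, and values for each dict entry. Hopefully that makes sense, and thank you!
Answer 0 (score: 2)
There may be solutions in base Python, but the simplest way I can think of is to use the pandas library to convert each list to a DataFrame and then join/merge them.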
import pandas as pd
dfA = pd.DataFrame(listA)
dfB = pd.DataFrame(listB)
merged_df = dfA.merge(dfB, left_on='uid', right_on='number')
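This returns a DataFrame with more columns than you need (i.e. there are columns for both "uid" and "number"), but you can select the columns you want, in the order you want, like this: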
merged_df = merged_df[['uid', 'name', 'val1']]
To merge multiple DataFrames into one master frame, see: pandas three-way joining multiple dataframes on columns
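As a rough sketch of how that multi-frame merge might look for the lists in the question, the following renames the inconsistent columns first and then folds the frames together with `functools.reduce`. The `renames` mapping and the choice of an outer merge on both "uid" and "name" are assumptions made for illustration, not part of the original answer.

```python
from functools import reduce

import pandas as pd

listA = [{"uid": "12345", "name": "John Smith"}]
listB = [{"number": "12345", "person": "John Smith", "val1": "25"}]
listC = [{"number": "12345", "person": "John Smith", "val2": "65"}]

# Assumed mapping from the inconsistent column names to the wanted ones.
renames = {"number": "uid", "person": "name"}
frames = [
    pd.DataFrame(rows).rename(columns=renames)
    for rows in (listA, listB, listC)
]

# Merge the frames pairwise on the shared identifier columns.
merged_df = reduce(
    lambda left, right: left.merge(right, on=["uid", "name"], how="outer"),
    frames,
)

# Convert back to a list of dicts, one per row.
masterlist = merged_df.to_dict(orient="records")
print(masterlist)
```

With an outer merge, a uid that is missing from one of the lists still appears in the result, with NaN in the columns that list would have supplied.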
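Answer 1 (score: 0)
You should put all your input lists in a list of lists, so that you can build a dict that maps uid to a dict of aggregated item values; the list of dicts you want is then simply the values of that mapping. To allow for inconsistent key names across the input dicts, pop the keys you don't want (such as number and id in my example) and assign the popped value to the key you want to keep (such as uid in the example):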
wanted_key = 'uid'
unwanted_keys = {'number', 'id'}
mapping = {}
for l in lists:
    for d in l:
        # Rename whichever alternative key is present to the wanted key.
        if wanted_key not in d:
            d[wanted_key] = d.pop(unwanted_keys.intersection(d).pop())
        # Fold this dict into the aggregate entry for its uid.
        mapping.setdefault(d[wanted_key], {}).update(d)
masterlist = list(mapping.values())
so that given:
lists = [
    [
        {"uid": "12345", "name": "John Smith"},
        {"uid": "56789", "name": "Joe Brown", "val1": "1"}
    ],
    [
        {"number": "12345", "name": "John Smith", "val1": "25"},
        {"number": "56789", "name": "Joe Brown", "val2": "2"}
    ],
    [
        {"id": "12345", "name": "John Smith", "val2": "65"}
    ]
]
masterlist becomes:
[{'uid': '12345', 'name': 'John Smith', 'val1': '25', 'val2': '65'}, {'uid': '56789', 'name': 'Joe Brown', 'val1': '1', 'val2': '2'}]
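Note that the loop in this answer renames keys in place, so the input dicts are modified. If that matters, one possible variation is to work on copies inside a function. The sketch below is only an illustration: the name `merge_by_key` and the decision to skip records that carry no identifier at all are assumptions, not part of the answer.

```python
def merge_by_key(lists, wanted_key="uid", unwanted_keys=("number", "id")):
    """Merge dicts from several lists into one dict per identifier."""
    mapping = {}
    for records in lists:
        for record in records:
            item = dict(record)  # shallow copy, so the caller's dict is untouched
            if wanted_key not in item:
                # Look for the first alternative key name this record uses.
                alias = next((k for k in unwanted_keys if k in item), None)
                if alias is None:
                    continue  # assumption: silently skip records with no identifier
                item[wanted_key] = item.pop(alias)
            mapping.setdefault(item[wanted_key], {}).update(item)
    return list(mapping.values())


lists = [
    [{"uid": "12345", "name": "John Smith"}],
    [{"number": "12345", "name": "John Smith", "val1": "25"}],
    [{"id": "12345", "name": "John Smith", "val2": "65"}],
]
masterlist = merge_by_key(lists)
print(masterlist)
print(lists[1][0])  # still has its original "number" key
```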
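Answer 2 (score: 0)
If you need to use a different key for each list, the following solution also uses an intermediate dict, together with a function that accepts the key representing the uid and one or more keys to copy: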
# Index the people from listA by uid so they can be looked up directly.
people_by_uid = {person["uid"]: person for person in listA}

def update_values(listX, uid_key, *val_keys):
    # Copy the requested value keys from each entry onto the matching person.
    for entry in listX:
        person = people_by_uid[entry[uid_key]]
        for val_key in val_keys:
            person[val_key] = entry[val_key]

update_values(listB, "number", "val1")
update_values(listC, "number", "val2")
# e.g. if you had a listD from which you also needed val3 and val4:
# update_values(listD, "number", "val3", "val4")
masterlist = list(people_by_uid.values())
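As written, this approach raises a KeyError if another list mentions a uid that is not in listA, or if an entry lacks one of the requested value keys. A tolerant variant might look like the sketch below. Passing `people_by_uid` as a parameter, creating a stub record for unknown uids, and skipping missing value keys are all assumptions made for illustration; the sample data is hypothetical too.

```python
def update_values(people_by_uid, entries, uid_key, *val_keys):
    """Copy val_keys from each entry onto the person with the same uid."""
    for entry in entries:
        uid = entry[uid_key]
        # Assumption: an unknown uid gets a new stub record instead of a KeyError.
        person = people_by_uid.setdefault(uid, {"uid": uid})
        for val_key in val_keys:
            if val_key in entry:  # assumption: missing values are skipped
                person[val_key] = entry[val_key]


# Hypothetical sample data.
listA = [{"uid": "12345", "name": "John Smith"}]
listB = [{"number": "12345", "val1": "25"}, {"number": "99999", "val1": "7"}]
listC = [{"number": "12345"}]  # no val2 here

people_by_uid = {person["uid"]: person for person in listA}
update_values(people_by_uid, listB, "number", "val1")
update_values(people_by_uid, listC, "number", "val2")
masterlist = list(people_by_uid.values())
print(masterlist)
```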
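Answer 3 (score: 0)
You can do this without Pandas using list comprehensions: build a dictionary of dictionaries that groups the dicts from the lists by their "uid", then use .values() on that grouping dictionary to get a list of dicts again: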
listA = [{"uid": "12345", "name": "John Smith"},{"uid": "67890", "name": "Jane Doe"}]
listB = [{"number": "12345", "person": "John Smith", "val1": "25"},{"number": "67890", "val1": "37"}]
listC = [{"number": "12345", "person": "John Smith", "val2": "65"},{"number": "67890", "val2": "53"}]
from collections import defaultdict
fn = { "number":"uid", "person":"name" } # map to get uniform key names
data = [ { fn.get(k,k):v for k,v in d.items() } for d in listA+listB+listC ]
result = next(r for r in [defaultdict(dict)] if [r[d["uid"]].update(d) for d in data])
print(*result.values(), sep="\n")
{'uid': '12345', 'name': 'John Smith', 'val1': '25', 'val2': '65'}
{'uid': '67890', 'name': 'Jane Doe', 'val1': '37', 'val2': '53'}
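The `next(...)` expression above relies on a list comprehension run purely for its side effects, which can be hard to read. An equivalent plain loop, shown here as a sketch with the same sample data (the names `key_names`, `grouped`, and `normalised` are chosen for illustration), might look like this:

```python
from collections import defaultdict

listA = [{"uid": "12345", "name": "John Smith"}, {"uid": "67890", "name": "Jane Doe"}]
listB = [{"number": "12345", "person": "John Smith", "val1": "25"}, {"number": "67890", "val1": "37"}]
listC = [{"number": "12345", "person": "John Smith", "val2": "65"}, {"number": "67890", "val2": "53"}]

key_names = {"number": "uid", "person": "name"}  # map to uniform key names

grouped = defaultdict(dict)
for d in listA + listB + listC:
    # Rename keys to the uniform names, leaving unknown keys unchanged.
    normalised = {key_names.get(k, k): v for k, v in d.items()}
    # Merge this dict into the entry for its uid.
    grouped[normalised["uid"]].update(normalised)

masterlist = list(grouped.values())
print(*masterlist, sep="\n")
```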