I have this deeply nested tree of dictionaries:
sample = {"name": "one",
          "id": "1",
          "children": [{"name": "two",
                        "id": "2",
                        "children": [{"name": "six",
                                      "id": "6",
                                      "children": []},
                                     {"name": "seven",
                                      "id": "7",
                                      "children": []}]},
                       {"name": "three",
                        "id": "3",
                        "children": []},
                       {"name": "four",
                        "id": "4",
                        "children": []},
                       {"name": "five",
                        "id": "5",
                        "children": []}]}
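This is just an example; the real data has 7 or 8 levels of nested children lists. Every name and id is unique.
My goal is to flatten this tree into a single dict in which the value of every "name" key becomes a key, mapped to a dict that holds the corresponding id: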
sample = {"one": {"id": "1"},
          "two": {"id": "2"},
          "three": {"id": "3"}, ...}
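The real nodes carry more key-value pairs, but I am only interested in the name and its associated id.
I tried to wrap my head around it, but my recursion skills are not great, so I am asking for your help. I also searched for similar questions, but the fact that the dicts are wrapped in lists makes them not really comparable, at least for me...
I did come up with something that solves my problem, but it is the hackiest, ugliest code I have ever written and I am ashamed of it. Basically, I convert the dict to its string representation and use a regex to find my pairs! It is terrible, but I had to get a prototype working so I would have time for other problems...
So, any ideas, guys?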
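Answer 0 (score: 3)
You can write a recursive function like this (assuming every dict is well-formed, i.e. has "name", "id", and "children" keys):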
def flatten(source, target):
    target[source["name"]] = {"id": source["id"]}
    for child in source["children"]:
        flatten(child, target)
Example:
>>> d = {}
>>> flatten(sample, d)
>>> d
{'seven': {'id': '7'}, 'six': {'id': '6'}, 'three': {'id': '3'}, 'two': {'id': '2'}, 'four': {'id': '4'}, 'five': {'id': '5'}, 'one': {'id': '1'}}
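The function above raises a KeyError as soon as one node lacks a key. If the real data is not guaranteed to be well-formed, a more lenient variant could look like the sketch below. The name flatten_lenient, the optional target argument, and the choice to silently skip incomplete nodes are assumptions made for illustration, not part of the original answer.

```python
def flatten_lenient(source, target=None):
    """Like flatten(), but tolerates nodes with missing keys (hypothetical helper)."""
    if target is None:
        target = {}
    # Only record nodes that carry both a name and an id.
    if "name" in source and "id" in source:
        target[source["name"]] = {"id": source["id"]}
    # A missing "children" key is treated as an empty list.
    for child in source.get("children", []):
        flatten_lenient(child, target)
    return target


# Small made-up tree with two incomplete nodes.
tree = {"name": "one", "id": "1",
        "children": [{"name": "two", "id": "2"},       # no "children" key
                     {"id": "3", "children": []}]}     # no "name" key
result = flatten_lenient(tree)
print(result)
```

Whether skipping incomplete nodes is the right behaviour depends on the data; raising an error may be preferable if a missing key signals a real problem.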
Or like this, if you would rather not pass the target dict as an argument:
def flatten(source):
    d = {source["name"]: {"id": source["id"]}}
    for child in source["children"]:
        d.update(flatten(child))
    return d
Example:
>>> flatten(sample)
{'one': {'id': '1'}, 'four': {'id': '4'}, 'seven': {'id': '7'}, 'five': {'id': '5'}, 'six': {'id': '6'}, 'three': {'id': '3'}, 'two': {'id': '2'}}
You can also simplify the output to a plain, non-nested dict:
def flatten(source):
    d = {source["name"]: source["id"]}
    for child in source["children"]:
        d.update(flatten(child))
    return d
>>> flatten(sample)
{'one': '1', 'four': '4', 'seven': '7', 'five': '5', 'six': '6', 'three': '3', 'two': '2'}
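Since the question mentions being uncomfortable with recursion, here is a rough non-recursive sketch of the same flattening that uses an explicit stack. The name flatten_iterative is made up for this example, and it assumes the same well-formed structure as above. With only 7 or 8 levels the recursive versions are perfectly fine; this is just an alternative.

```python
def flatten_iterative(root):
    """Flatten the tree to {name: id} without recursion (hypothetical helper)."""
    flat = {}
    stack = [root]                       # nodes still waiting to be visited
    while stack:
        node = stack.pop()
        flat[node["name"]] = node["id"]
        stack.extend(node["children"])   # schedule the children for later
    return flat


# Reduced version of the tree from the question.
tree = {"name": "one", "id": "1", "children": [
    {"name": "two", "id": "2", "children": [
        {"name": "six", "id": "6", "children": []}]},
    {"name": "three", "id": "3", "children": []}]}
flat = flatten_iterative(tree)
print(flat)
```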
Answer 1 (score: 2)
If your data is actually more complex than your example:
def rec_get(d, k):
    if isinstance(d, dict):
        if k in d:
            yield (d[k], {"id": d["id"]})
        for v in d.values():
            yield from rec_get(v, k)
    elif isinstance(d, list):
        for v in d:
            yield from rec_get(v, k)

print(dict(rec_get(sample, "name")))
Output:
{'five': {'id': '5'}, 'six': {'id': '6'}, 'four': {'id': '4'}, 'one': {'id': '1'}, 'three': {'id': '3'}, 'two': {'id': '2'}, 'seven': {'id': '7'}}
If you want a more generic function, you could do something like this:
# Iterable lives in collections.abc; importing it from collections fails on Python 3.10+.
from collections.abc import Iterable

def rec_get(d, **kwargs):
    if isinstance(d, dict):
        # Yield (value, key) for every requested key present in this dict.
        yield from ((d[k], k) for k in kwargs.keys() & d)
        for v in d.values():
            yield from rec_get(v, **kwargs)
    elif isinstance(d, Iterable) and not isinstance(d, str):
        for v in d:
            yield from rec_get(v, **kwargs)
This is just an example that uses your input and passes keyword arguments:
print(list(rec_get(sample, name="name", id="id")))
Output:
[('1', 'id'), ('one', 'name'), ('2', 'id'), ('two', 'name'),
('6', 'id'), ('six', 'name'), ('7', 'id'), ('seven', 'name'),
('3', 'id'), ('three', 'name'), ('4', 'id'), ('four', 'name'), ('5', 'id'), ('five', 'name')]
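The flat list of (value, key) pairs above no longer shows which id belongs to which name. To get back to the dict asked for in the question, one possible variation is to yield one tuple per dict that contains all requested keys. The name iter_records and its positional-keys signature are assumptions for this sketch, not part of the answer above.

```python
def iter_records(node, *keys):
    """Yield a tuple of values for each dict that has all of `keys` (hypothetical helper)."""
    if isinstance(node, dict):
        if all(k in node for k in keys):
            yield tuple(node[k] for k in keys)
        for value in node.values():
            yield from iter_records(value, *keys)
    elif isinstance(node, list):
        for item in node:
            yield from iter_records(item, *keys)


# Reduced version of the tree from the question.
tree = {"name": "one", "id": "1", "children": [
    {"name": "two", "id": "2", "children": [
        {"name": "six", "id": "6", "children": []}]},
    {"name": "three", "id": "3", "children": []}]}

# Each record is a (name, id) pair, so it can be unpacked directly.
flat = {name: {"id": id_} for name, id_ in iter_records(tree, "name", "id")}
print(flat)
```

Dicts that lack one of the requested keys are skipped but still searched, so nodes nested deeper inside them are not lost.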