Flatten a dict of dicts that contain lists of other dicts

Date: 2015-12-09 22:22:42

Tags: python dictionary

I have this highly nested dictionary tree:

sample = {"name": "one",
          "id": "1",
          "children": [{"name": "two",
                        "id": "2",
                        "children": [{"name": "six",
                                      "id": "6",
                                      "children": []}, 
                                     {"name": "seven",
                                      "id": "7",
                                      "children": []}]},
                       {"name": "three",
                        "id": "3",
                        "children": []},
                       {"name": "four",
                        "id": "4",
                        "children": []}, 
                       {"name": "five",
                        "id": "5",
                        "children": []}]}

This is just an example; in reality there are 7 or 8 levels of children lists... Every name and id is unique.

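My goal is to flatten this tree into a single dictionary in which the value of every "name" key becomes a key, with its id stored as a nested key-value pair: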

sample = {"one": {"id":"1"},
          "two": {"id":"2"},
          "three": {"id": "3"}, ...}

In reality there are more key-value pairs per node, but I am only interested in the names and their associated ids.

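I tried to wrap my head around it, but sadly my recursion skills are not very good, so I am asking for your help. I also searched for similar questions, but the fact that the dicts are wrapped in lists makes them not really comparable, for me anyway...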

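I did come up with a solution to my problem, but it is the most hackish and ugliest code I have ever written, and I am ashamed of it. Basically I convert the dict to its string representation and use a regex to find my pairs! It is terrible, but I had to get a prototype working so that I had time to deal with other problems...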

So, any ideas, guys?

2 answers:

Answer 0: (score: 3)

You can write a recursive function like this (assuming every dict is correctly structured):

def flatten(source, target):
    target[source["name"]] = {"id": source["id"]}
    for child in source["children"]:
        flatten(child, target)

Example:

>>> d = {}
>>> flatten(sample, d)
>>> d
{'seven': {'id': '7'}, 'six': {'id': '6'}, 'three': {'id': '3'}, 'two': {'id': '2'}, 'four': {'id': '4'}, 'five': {'id': '5'}, 'one': {'id': '1'}}

Or like this, if you don't like passing the target dict as an argument:

def flatten(source):
    d = {source["name"]: {"id": source["id"]}}
    for child in source["children"]:
        d.update(flatten(child))
    return d

Example:

>>> flatten(sample)
{'one': {'id': '1'}, 'four': {'id': '4'}, 'seven': {'id': '7'}, 'five': {'id': '5'}, 'six': {'id': '6'}, 'three': {'id': '3'}, 'two': {'id': '2'}}

You can also simplify the output to a plain, non-nested dictionary:

def flatten(source):
    d = {source["name"]: source["id"]}
    for child in source["children"]:
        d.update(flatten(child))
    return d

>>> flatten(sample)
{'one': '1', 'four': '4', 'seven': '7', 'five': '5', 'six': '6', 'three': '3', 'two': '2'}
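
If recursion depth is ever a concern for much deeper trees, the same traversal can be written with an explicit stack. This is only a sketch under the same assumption as above (every node has "name", "id" and "children"); the name flatten_iterative and the small tree are made up for illustration.

```python
def flatten_iterative(source):
    """Flatten a name/id/children tree into {name: id} without recursion."""
    result = {}
    stack = [source]  # nodes that still have to be visited
    while stack:
        node = stack.pop()
        result[node["name"]] = node["id"]
        # Queue the children; each one is a dict with the same structure.
        stack.extend(node["children"])
    return result


# Hypothetical, shortened version of the tree from the question.
tree = {"name": "one", "id": "1",
        "children": [{"name": "two", "id": "2",
                      "children": [{"name": "six", "id": "6", "children": []}]},
                     {"name": "three", "id": "3", "children": []}]}

flat = flatten_iterative(tree)
print(flat)
```

The nodes are visited in a different order than in the recursive versions, which does not matter here because the result is a dict keyed by unique names.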

Answer 1: (score: 2)

If your data is actually more complex than your example:

def rec_get(d, k):
    if isinstance(d, dict):
        if k in d:
            yield (d[k], {"id": d["id"]})
        for v in d.values():
            yield from rec_get(v, k)
    elif isinstance(d, list):
        for v in d:
            yield from rec_get(v, k)

print(dict(rec_get(sample, "name")))

Output:

{'five': {'id': '5'}, 'six': {'id': '6'}, 'four': {'id': '4'}, 'one': {'id': '1'}, 'three': {'id': '3'}, 'two': {'id': '2'}, 'seven': {'id': '7'}}
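
To illustrate what "more complex" might mean here, the following sketch runs the same generator on a tree whose nodes are not always stored under "children". The irregular structure and its key names ("left", "items", "wrapper") are invented for this example; rec_get is repeated only so that the snippet runs on its own.

```python
def rec_get(d, k):
    if isinstance(d, dict):
        if k in d:
            yield (d[k], {"id": d["id"]})
        for v in d.values():
            yield from rec_get(v, k)
    elif isinstance(d, list):
        for v in d:
            yield from rec_get(v, k)


# Hypothetical irregular tree: nodes hide under differently named keys
# and inside nested lists.
irregular = {"name": "root", "id": "0",
             "left": {"name": "a", "id": "1"},
             "items": [{"wrapper": {"name": "b", "id": "2"}},
                       [{"name": "c", "id": "3", "children": []}]]}

result = dict(rec_get(irregular, "name"))
print(result)
```

Note that this version still assumes every dict that has a "name" also has an "id"; a node without one would raise a KeyError.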

If you want a more generic function, you can do this:

from collections.abc import Iterable


def rec_get(d, **kwargs):
    if isinstance(d, dict):
        yield from ((d[k], k) for k in kwargs.keys() & d)
        for v in d.values():
            yield from rec_get(v, **kwargs)
    elif isinstance(d, Iterable) and not isinstance(d, str):
        for v in d:
            yield from rec_get(v, **kwargs)

This is just an example using your input and passing keyword arguments:

print(list(rec_get(sample, name="name", id="id")))

Output:

[('1', 'id'), ('one', 'name'), ('2', 'id'), ('two', 'name'), 
('6', 'id'), ('six', 'name'), ('7', 'id'), ('seven', 'name'), 
('3', 'id'), ('three', 'name'), ('4', 'id'), ('four', 'name'), ('5', 'id'), ('five', 'name')]
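
The generic version yields each matching value on its own, so the name and the id of a node are no longer paired. One possible way to keep them together, sketched here as an assumption rather than as part of the answer, is to yield one tuple per dict that contains all requested keys. The name rec_pick and the small tree are hypothetical.

```python
from collections.abc import Iterable


def rec_pick(d, *keys):
    """Yield a tuple of d[k] for every nested dict that has all of keys."""
    if isinstance(d, dict):
        if all(k in d for k in keys):
            yield tuple(d[k] for k in keys)
        for v in d.values():
            yield from rec_pick(v, *keys)
    elif isinstance(d, Iterable) and not isinstance(d, str):
        for v in d:
            yield from rec_pick(v, *keys)


# Hypothetical, shortened version of the tree from the question.
tree = {"name": "one", "id": "1",
        "children": [{"name": "two", "id": "2",
                      "children": [{"name": "six", "id": "6", "children": []}]},
                     {"name": "three", "id": "3", "children": []}]}

pairs = dict(rec_pick(tree, "name", "id"))
print(pairs)
```

With exactly two keys the tuples can be passed straight to dict(), which gives the flat name-to-id mapping asked for in the question.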