Selective merge on Python dictionaries

Time: 2018-10-06 13:02:27

Tags: python dictionary merge compare

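I have two dictionaries in Python (d1, d2), and I need to copy the missing "id" items from d2 into d1 while ignoring any other differences (such as the extra child "Bill" that appears only in d2). Effectively, the resulting dictionary should be just d1 with the "id" items added. I have tried merging, but it does not work because data gets lost one way or the other.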

   
d1 = {
    "parent": {
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "name": "John",
                "surname": "Doe"
            },                
            {
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "name": "Kim",
                        "surname": "Doe"
                    }
                ]
            }
        ]
    }
}

d2 = {
    "parent": {
        "id": 1,
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "id": 2,
                "name": "John",
                "surname": "Doe"
            },
            {
                "id": 3,
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "id": 4,
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "id": 5,
                        "name": "Kim",
                        "surname": "Doe"
                    },
                    {
                        "id": 6,
                        "name": "Bill",
                        "surname": "Doe"
                    },
                ]
            }
        ]
    }
}

result = {
    "parent": {
        "id": 1,
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "id": 2,
                "name": "John",
                "surname": "Doe"
            },
            {
                "id": 3,
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "id": 4,
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "id": 5,
                        "name": "Kim",
                        "surname": "Doe"
                    }
                ]
            }
        ]
    }
}

Any ideas?

2 Answers:

Answer 0: (score: 2)

I match children according to a key function (in this case, the "name" and "surname" attributes).
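
As a quick illustration of what such a key function returns, here is a minimal sketch. The two sample nodes are made up for the demo; only `operator.itemgetter` and the field names come from this answer.

```python
import operator

# Key function as used in this answer: picks the identifying fields.
keyfunc = operator.itemgetter("name", "surname")

# Hypothetical sample nodes, only for this demo.
with_id = {"id": 2, "name": "John", "surname": "Doe"}
without_id = {"name": "John", "surname": "Doe"}

# With several keys, itemgetter returns a tuple, which is hashable
# and can therefore serve as a dictionary key.
key = keyfunc(with_id)
print(key)  # ('John', 'Doe')

# The "id" field plays no part in the key, so both nodes match.
print(keyfunc(without_id) == key)  # True
```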

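Then I iterate over the id_lookup dictionary (named d2 in your example) and try to match each of its children against the children of main_dict. If a match is found, I recurse into the matching pair.

In the end, main_dict (d1 in your example) is filled with the IDs :-)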

import operator

# main_dict is d1 and id_lookup_dict is d2 from the question
root = main_dict["parent"]
lookup_root = id_lookup_dict["parent"]

keyfunc = operator.itemgetter("name", "surname")

def _recursive_fill_id(root, lookup_root, keyfunc):
    """Recursively fill root node with IDs

    Matches nodes according to keyfunc
    """
    root["id"] = lookup_root["id"]

    # Fetch children
    root_children = root.get("children")

    # There are no children
    if root_children is None:
        return

    children_left = len(root_children)

    # Create a dict mapping the key identifying a child to the child
    # This avoids a hefty lookup cost and requires a single iteration.
    children_dict = dict(zip(map(keyfunc, root_children), root_children))

    # .get() guards against a lookup node that has no "children" key
    for lookup_child in lookup_root.get("children", []):
        lookup_key = keyfunc(lookup_child)
        matching_child = children_dict.get(lookup_key)

        if matching_child is not None:
            _recursive_fill_id(matching_child, lookup_child, keyfunc)

            # Short circuit in case all children were filled
            children_left -= 1
            if not children_left:
                break

_recursive_fill_id(root, lookup_root, keyfunc)
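
To check the behaviour against the question's data, here is a condensed, self-contained sketch of the same approach. The name `fill_ids` and the trimmed-down d1/d2 are illustrative assumptions, not part of the original answer; the `.get("children", [])` guard is also an addition.

```python
import operator

keyfunc = operator.itemgetter("name", "surname")

def fill_ids(node, lookup_node, keyfunc):
    """Copy "id" from lookup_node into node, then recurse into matched children."""
    node["id"] = lookup_node["id"]
    children = node.get("children")
    if children is None:
        return
    # Map each child's key to the child for constant-time matching.
    by_key = {keyfunc(child): child for child in children}
    # Added guard: a lookup node may lack "children" entirely.
    for lookup_child in lookup_node.get("children", []):
        match = by_key.get(keyfunc(lookup_child))
        if match is not None:
            fill_ids(match, lookup_child, keyfunc)

# Trimmed-down versions of d1 and d2 from the question.
d1 = {"parent": {"name": "Axl", "surname": "Doe", "children": [
    {"name": "Jane", "surname": "Doe", "children": [
        {"name": "Jim", "surname": "Doe"},
    ]},
]}}
d2 = {"parent": {"id": 1, "name": "Axl", "surname": "Doe", "children": [
    {"id": 3, "name": "Jane", "surname": "Doe", "children": [
        {"id": 4, "name": "Jim", "surname": "Doe"},
        {"id": 6, "name": "Bill", "surname": "Doe"},
    ]},
]}}

fill_ids(d1["parent"], d2["parent"], keyfunc)

jane = d1["parent"]["children"][0]
print(d1["parent"]["id"], jane["id"], jane["children"][0]["id"])  # 1 3 4
print(len(jane["children"]))  # 1, "Bill" was not copied over
```

Because only nodes already present in d1 are visited, extra entries in d2 such as "Bill" are never added.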

Answer 1: (score: 1)

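I would like to add an iterative answer in place of the recursive one, as it may prove more efficient.

It will not hit any stack limit, and it will be a bit faster: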

import operator

# main_dict is d1 and id_lookup_dict is d2 from the question
root = main_dict["parent"]
lookup_root = id_lookup_dict["parent"]

keyfunc = operator.itemgetter("name", "surname")

def _recursive_fill_id(root, lookup_root, keyfunc):
    """Fill root node with IDs iteratively, using an explicit stack

    Matches nodes according to keyfunc
    """
    matching_nodes = [(root, lookup_root)]

    while matching_nodes:
        root, lookup_root = matching_nodes.pop()
        root["id"] = lookup_root["id"]

        # Fetch children
        root_children = root.get("children")

        # There are no children
        if root_children is None:
            continue

        children_left = len(root_children)

        # Create a dict mapping the key identifying a child to the child
        # This avoids a hefty lookup cost and requires a single iteration.
        children_dict = dict(zip(map(keyfunc, root_children), root_children))

        # .get() guards against a lookup node that has no "children" key
        for lookup_child in lookup_root.get("children", []):
            lookup_key = keyfunc(lookup_child)
            matching_child = children_dict.get(lookup_key)

            if matching_child is not None:
                matching_nodes.append((matching_child, lookup_child))

                # Short circuit in case all children were filled
                children_left -= 1
                if not children_left:
                    break


_recursive_fill_id(root, lookup_root, keyfunc)
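
To illustrate the stack-limit point, here is a rough sketch that runs an explicit-stack version on a tree nested deeper than the interpreter's recursion limit. The names `fill_ids_iterative` and `make_chain` and the generated data are assumptions made up for this demo; no timing claim is tested here.

```python
import operator
import sys

keyfunc = operator.itemgetter("name", "surname")

def fill_ids_iterative(root, lookup_root, keyfunc):
    """Fill "id" fields using an explicit stack of (node, lookup_node) pairs."""
    stack = [(root, lookup_root)]
    while stack:
        node, lookup_node = stack.pop()
        node["id"] = lookup_node["id"]
        children = node.get("children")
        if children is None:
            continue
        by_key = {keyfunc(child): child for child in children}
        for lookup_child in lookup_node.get("children", []):
            match = by_key.get(keyfunc(lookup_child))
            if match is not None:
                stack.append((match, lookup_child))

def make_chain(depth, with_ids):
    """Build a single-child chain `depth` nodes deep, without recursion."""
    node = None
    # Build from the deepest node upwards; the last node built is the root.
    for level in range(depth, 0, -1):
        new = {"name": "N%d" % level, "surname": "Doe"}
        if with_ids:
            new["id"] = level
        if node is not None:
            new["children"] = [node]
        node = new
    return node

# Twice the recursion limit: deeper than a naive recursive walk could go.
depth = sys.getrecursionlimit() * 2
main = make_chain(depth, with_ids=False)
lookup = make_chain(depth, with_ids=True)

fill_ids_iterative(main, lookup, keyfunc)

# Walk down the chain with a loop and count the nodes that received an id.
count, node = 0, main
while node is not None:
    if "id" in node:
        count += 1
    children = node.get("children")
    node = children[0] if children else None

print(count == depth)  # True
```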