Python: regroup a list of dictionaries into two levels of nested dictionaries?

Time: 2019-05-08 10:04:07

Tags: python dictionary

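I have a list of dictionaries in which several entries share the same site and the same device, and I want to regroup these dictionaries by site and then by device.

I have included the sample input and the structure I want.

I thought I could do the multiple groupings with itertools, and I do get the groups, but I am not sure how to combine them all, or whether this is the most efficient approach.

itertools attempt: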

import itertools

site_groups = itertools.groupby(bgp_data_query, lambda i: i['location'])
for site_key, site in site_groups:
    device_groups = itertools.groupby(site, lambda i: i['device_name'])
    for device_key, device in device_groups:
        pass  # not sure how to combine the groups from here

Raw data

[
    {
        "bgp_peer_as": "1",
        "bgp_session": "3:35",
        "bgp_routes": "0",
        "service_status": "Down",
        "location": "London",
        "circuit_name": "MPLS",
        "device_name": "LON-EDGE",
        "timestamp" : "2019-5-8 12:30:00"
    },
    {
        "bgp_peer_as": "3",
        "bgp_session": "4:25",
        "bgp_routes": "100",
        "service_status": "UP",
        "location": "London",
        "circuit_name": "MPLS 02",
        "device_name": "LON-EDGE",
        "timestamp" : "2019-5-8 12:30:00"
    },    
    {
        "bgp_peer_as": "18",
        "bgp_session": "1:25",
        "bgp_routes": "1",
        "service_status": "UP",
        "location": "London",
        "circuit_name": "INTERNET",
        "device_name": "LON-INT-GW",
        "timestamp" : "2019-5-8 12:31:00"
    },  
    {
        "bgp_peer_as": "20",
        "bgp_session": "1:25",
        "bgp_routes": "1",
        "service_status": "UP",
        "location": "Manchester",
        "circuit_name": "INTERNET",
        "device_name": "MAN-INT-GW",
        "timestamp" : "2019-5-8 12:20:00"
    },     
    {
        "bgp_peer_as": "20",
        "bgp_session": "1:25",
        "bgp_routes": "1",
        "service_status": "UP",
        "location": "Manchester",
        "circuit_name": "INTERNET 02",
        "device_name": "MAN-INT-GW",
        "timestamp" : "2019-5-8 12:20:00"
    }, 
    {
        "bgp_peer_as": "45",
        "bgp_session": "1:25",
        "bgp_routes": "1",
        "service_status": "UP",
        "location": "Manchester",
        "circuit_name": "MPLS 01",
        "device_name": "MAN-EDGE",
        "timestamp" : "2019-5-8 12:21:00"
    },             
]

Desired dictionary

[
    {
    "London": {
        "LON-EDGE": {
            "bgp_peer_as": "1",
            "bgp_session": "3:35",
            "bgp_routes": "0",
            "service_status": "Down",
            "circuit_name": "MPLS",
            },
            {
            "bgp_peer_as": "3",
            "bgp_session": "4:25",
            "bgp_routes": "100",
            "service_status": "UP",
            "circuit_name": "MPLS 02",
            }
        },
        { 
        "LON-INT-GW" : {
            "bgp_peer_as": "18",
            "bgp_session": "1:25",
            "bgp_routes": "1",
            "service_status": "UP",
            "circuit_name": "INTERNET",
            }
        }
    }
],
[
    { 
    "Manchester": { 
        "MAN-EDGE": {
            "bgp_peer_as": "45",
            "bgp_session": "1:25",
            "bgp_routes": "1",
            "service_status": "UP",
            "circuit_name": "MPLS 01",
            }
        },
        {
        "MAN-INT-GW": {
            "bgp_peer_as": "20",
            "bgp_session": "1:25",
            "bgp_routes": "1",
            "service_status": "UP",
            "circuit_name": "INTERNET",
            },
            {
            "bgp_peer_as": "20",
            "bgp_session": "1:25",
            "bgp_routes": "1",
            "service_status": "UP",
            "circuit_name": "INTERNET 02",
            }
        }
    }
]

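2 answers:

Answer 0: (score: 2)

Use a nested collections.defaultdict with a list at the deepest level, then loop over the items and pop the grouping keys so that they do not appear in the final data: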

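import collections

# Two-level defaultdict: location -> device_name -> list of records
result = collections.defaultdict(lambda: collections.defaultdict(list))

for d in raw_dict:  # raw_dict is the list of dictionaries from the question
    # pop() removes the grouping keys so they do not appear in the final data
    location = d.pop("location")
    device_name = d.pop("device_name")
    result[location][device_name].append(d)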
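Note that `d.pop` modifies the input dictionaries in place. If the original records are still needed afterwards, a non-mutating variant could look like the sketch below. The function name `regroup`, the `GROUP_KEYS` constant and the two-record sample are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict

# Hypothetical constant naming the two fields used for grouping.
GROUP_KEYS = ("location", "device_name")

def regroup(records):
    """Group records by location, then by device_name, without mutating them."""
    result = defaultdict(lambda: defaultdict(list))
    for record in records:
        # Copy every field except the two grouping keys.
        rest = {k: v for k, v in record.items() if k not in GROUP_KEYS}
        result[record["location"]][record["device_name"]].append(rest)
    return result

# Made-up sample with the same field names as the question's data.
sample = [
    {"location": "London", "device_name": "LON-EDGE", "circuit_name": "MPLS"},
    {"location": "London", "device_name": "LON-EDGE", "circuit_name": "MPLS 02"},
]
grouped = regroup(sample)
print(grouped["London"]["LON-EDGE"])
```

The trade-off is one extra dictionary copy per record in exchange for leaving the source list untouched.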

Result with your data (dumped as JSON to get rid of the defaultdict representation):

import json
print(json.dumps(result,indent=4))

{
    "Manchester": {
        "MAN-INT-GW": [
            {
                "bgp_routes": "1",
                "service_status": "UP",
                "bgp_peer_as": "20",
                "circuit_name": "INTERNET",
                "bgp_session": "1:25"
            },
            {
                "bgp_routes": "1",
                "service_status": "UP",
                "bgp_peer_as": "20",
                "circuit_name": "INTERNET 02",
                "bgp_session": "1:25"
            }
        ],
        "MAN-EDGE": [
            {
                "bgp_routes": "1",
                "service_status": "UP",
                "bgp_peer_as": "45",
                "circuit_name": "MPLS 01",
                "bgp_session": "1:25"
            }
        ]
    },
    "London": {
        "LON-EDGE": [
            {
                "bgp_routes": "0",
                "service_status": "Down",
                "bgp_peer_as": "1",
                "circuit_name": "MPLS",
                "bgp_session": "3:35"
            },
            {
                "bgp_routes": "100",
                "service_status": "UP",
                "bgp_peer_as": "3",
                "circuit_name": "MPLS 02",
                "bgp_session": "4:25"
            }
        ],
        "LON-INT-GW": [
            {
                "bgp_routes": "1",
                "service_status": "UP",
                "bgp_peer_as": "18",
                "circuit_name": "INTERNET",
                "bgp_session": "1:25"
            }
        ]
    }
}
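If plain dictionaries are needed rather than a JSON string, one option is to convert the nested defaultdict explicitly. This is a minimal sketch that assumes the two-level structure built above; the `to_plain` helper name and the sample entry are hypothetical.

```python
from collections import defaultdict

def to_plain(nested):
    """Convert a defaultdict of defaultdicts of lists into plain dicts."""
    return {outer: dict(inner) for outer, inner in nested.items()}

# Small stand-in for the grouped result.
result = defaultdict(lambda: defaultdict(list))
result["London"]["LON-EDGE"].append({"circuit_name": "MPLS"})

plain = to_plain(result)
print(plain)  # {'London': {'LON-EDGE': [{'circuit_name': 'MPLS'}]}}
```

Unlike the defaultdict, the plain dictionary raises KeyError for a missing site or device instead of silently creating an empty entry.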

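Note that a solution based on itertools.groupby would also work, but only if identical keys are consecutive. Otherwise it creates multiple groups for the same key, which is not what you want.

Answer 1: (score: 1)

You can use itertools.groupby together with defaultdict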
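The consecutive-key restriction can be seen in a small sketch; the `locations` list is made-up sample data.

```python
import itertools

locations = ["London", "Manchester", "London"]

# Unsorted: the two "London" entries are not adjacent, so they form separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(locations)]
print(unsorted_keys)  # ['London', 'Manchester', 'London']

# Sorted first: equal keys become adjacent and each key appears once.
sorted_keys = [key for key, _ in itertools.groupby(sorted(locations))]
print(sorted_keys)  # ['London', 'Manchester']
```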

import itertools
from collections import defaultdict
res = defaultdict(dict)

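# groupby only merges consecutive items, so the input must already be ordered by location and device_name
for x, g in itertools.groupby(bgp_data_query, key=lambda x: x["location"]):
    for d, f in itertools.groupby(g, key=lambda x: x["device_name"]):
        # Each remaining key/value pair becomes its own one-item dict
        res[x][d] = [{k: v} for z in f for k, v in z.items() if k not in {"location", "device_name"}]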

print(dict(res))

Output:

{'London': {'LON-EDGE': [{'bgp_peer_as': '1'},
   {'bgp_routes': '0'},
   {'circuit_name': 'MPLS'},
   {'bgp_session': '3:35'},
   {'service_status': 'Down'},
   {'bgp_peer_as': '3'},
   {'bgp_routes': '100'},
   {'circuit_name': 'MPLS 02'},
   {'bgp_session': '4:25'},
   {'service_status': 'UP'}],
  'LON-INT-GW': [{'bgp_peer_as': '18'},
   {'bgp_routes': '1'},
   {'circuit_name': 'INTERNET'},
   {'bgp_session': '1:25'},
   {'service_status': 'UP'}]},
 'Manchester': {'MAN-EDGE': [{'bgp_peer_as': '45'},
   {'bgp_routes': '1'},
   {'circuit_name': 'MPLS 01'},
   {'bgp_session': '1:25'},
   {'service_status': 'UP'}],
  'MAN-INT-GW': [{'bgp_peer_as': '20'},
   {'bgp_routes': '1'},
   {'circuit_name': 'INTERNET'},
   {'bgp_session': '1:25'},
   {'service_status': 'UP'},
   {'bgp_peer_as': '20'},
   {'bgp_routes': '1'},
   {'circuit_name': 'INTERNET 02'},
   {'bgp_session': '1:25'},
   {'service_status': 'UP'}]}}
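The output above puts every key/value pair into its own one-item dictionary, which loses the boundaries between records. A possible variant that keeps one dictionary per record, and sorts first so that groupby also works on unordered input, is sketched below. The function name `group_sorted` and the sample records are assumptions for illustration.

```python
import itertools
from operator import itemgetter

def group_sorted(records):
    """Group by location, then device_name, keeping one dict per record."""
    key = itemgetter("location", "device_name")  # returns a (location, device_name) tuple
    res = {}
    # groupby only merges adjacent items, so sort by the same key first.
    for (location, device), group in itertools.groupby(sorted(records, key=key), key=key):
        res.setdefault(location, {})[device] = [
            {k: v for k, v in rec.items() if k not in ("location", "device_name")}
            for rec in group
        ]
    return res

# Made-up records; the two London entries are deliberately not adjacent.
records = [
    {"location": "London", "device_name": "LON-EDGE", "circuit_name": "MPLS"},
    {"location": "Manchester", "device_name": "MAN-EDGE", "circuit_name": "MPLS 01"},
    {"location": "London", "device_name": "LON-EDGE", "circuit_name": "MPLS 02"},
]
print(group_sorted(records))
```

Sorting costs O(n log n), so for large inputs the defaultdict approach, which makes a single pass, is likely the simpler choice.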