I have two lists of dictionaries with the following sample data:
List 1:
list_1 = [
{
"route": "10.10.4.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5"
},
{
"route": "10.10.5.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5"
},
{
"route": "10.10.8.0",
"mask": "255.255.255.0",
"next_hop": "172.16.66.34"
},
{
"route": "10.10.58.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5"
},
{
"route": "172.18.12.4",
"mask": "255.255.255.252",
"next_hop": "172.18.1.5"
}
]
List 2:
list_2 = [
{
"route": "10.10.4.0",
"site": "Edinburgh"
},
{
"route": "10.10.8.0",
"site": "Manchester"
},
{
"route": "10.10.5.0",
"site": "London"
},
]
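I merge these lists as follows:
import itertools
temp_merged_data = sorted(itertools.chain(list_1, list_2), key=lambda x: x['route'])
route_data = []
for k, v in itertools.groupby(temp_merged_data, key=lambda x: x['route']):
    d = {}
    for dct in v:
        d.update(dct)
    route_data.append(d)
This returns the following, but I don't want any routes that have no site. How would I achieve that? And once I have the final list of dictionaries / JSON, how can I filter it efficiently, for example if I only want the next hop for London?
Thanks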
[
{
"route": "10.10.4.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5",
"site": "Edinburgh"
},
{
"route": "10.10.5.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5",
"site": "London"
},
{
"route": "10.10.58.0",
"mask": "255.255.255.0",
"next_hop": "172.18.1.5"
},
{
"route": "10.10.8.0",
"mask": "255.255.255.0",
"next_hop": "172.16.66.34",
"site": "Manchester"
},
{
"route": "172.18.12.4",
"mask": "255.255.255.252",
"next_hop": "172.18.1.5"
}
]
Answer 0 (score: 2)
Here is a solution using pandas:
In [17]: import pandas as pd
In [18]: df1 = pd.DataFrame(list_1)
In [19]: df2=pd.DataFrame(list_2)
In [22]: df1.merge(df2, on='route', how='left')
Out[22]:
              mask      next_hop        route        site
0    255.255.255.0    172.18.1.5    10.10.4.0   Edinburgh
1    255.255.255.0    172.18.1.5    10.10.5.0      London
2    255.255.255.0  172.16.66.34    10.10.8.0  Manchester
3    255.255.255.0    172.18.1.5   10.10.58.0         NaN
4  255.255.255.252    172.18.1.5  172.18.12.4         NaN
To filter out the routes that have no site:
In [29]: merged = df1.merge(df2, on='route', how='left')
In [30]: df = merged[~merged.site.isna()]
In [31]: df
Out[31]:
            mask      next_hop      route        site
0  255.255.255.0    172.18.1.5  10.10.4.0   Edinburgh
1  255.255.255.0    172.18.1.5  10.10.5.0      London
2  255.255.255.0  172.16.66.34  10.10.8.0  Manchester
To filter for Edinburgh only:
df[df['site']=='Edinburgh']
To get it back in your format:
[v for k, v in df.T.to_dict().items()]
Output:
[{'mask': '255.255.255.0',
'next_hop': '172.18.1.5',
'route': '10.10.4.0',
'site': 'Edinburgh'},
{'mask': '255.255.255.0',
'next_hop': '172.18.1.5',
'route': '10.10.5.0',
'site': 'London'},
{'mask': '255.255.255.0',
'next_hop': '172.16.66.34',
'route': '10.10.8.0',
'site': 'Manchester'}]
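Since routes without a site are unwanted anyway, an inner merge may get to the same result in one step, and the London lookup from the question becomes a boolean filter. This is only a sketch, assuming pandas is installed and that `route` is the only shared column; `to_dict(orient='records')` is used here as a built-in alternative to the transpose approach above.

```python
import pandas as pd

list_1 = [
    {"route": "10.10.4.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.5.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.8.0", "mask": "255.255.255.0", "next_hop": "172.16.66.34"},
    {"route": "10.10.58.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "172.18.12.4", "mask": "255.255.255.252", "next_hop": "172.18.1.5"},
]
list_2 = [
    {"route": "10.10.4.0", "site": "Edinburgh"},
    {"route": "10.10.8.0", "site": "Manchester"},
    {"route": "10.10.5.0", "site": "London"},
]

# An inner join keeps only routes present in both lists,
# so routes without a site never make it into the result.
merged = pd.DataFrame(list_1).merge(pd.DataFrame(list_2), on="route", how="inner")

# Next hop(s) for London as a plain Python list.
london_next_hops = merged.loc[merged["site"] == "London", "next_hop"].tolist()

# Back to a list of dicts, one per row.
records = merged.to_dict(orient="records")

print(london_next_hops)
print(records)
```

The lookup returns a list rather than a single value because nothing in the data guarantees that a site has only one route.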
Answer 1 (score: 0)
import itertools
temp_merged_data = sorted(itertools.chain(list_1, list_2), key=lambda x: x['route'])
route_data = []
for k, v in itertools.groupby(temp_merged_data, key=lambda x: x['route']):
    d = {}
    for dct in v:
        d.update(dct)  # merge every dict for this route
    if "site" in d:  # keep the route only if a site was merged in
        route_data.append(d)
print(route_data)
Output:
[{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}]
Answer 2 (score: 0)
You can filter the result:
d = [{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.58.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}, {'route': '172.18.12.4', 'mask': '255.255.255.252', 'next_hop': '172.18.1.5'}]
new_d = [i for i in d if i.get('site')]
Output:
[{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}]
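For the second half of the question (the next hop for London only), the same comprehension style can be combined with `next()` to pull out a single value. A minimal sketch, assuming the first match is enough and that `None` is an acceptable result when the site does not exist; the variable names are illustrative.

```python
routes = [
    {"route": "10.10.4.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5", "site": "Edinburgh"},
    {"route": "10.10.5.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5", "site": "London"},
    {"route": "10.10.58.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.8.0", "mask": "255.255.255.0", "next_hop": "172.16.66.34", "site": "Manchester"},
    {"route": "172.18.12.4", "mask": "255.255.255.252", "next_hop": "172.18.1.5"},
]

# Keep only entries that carry a non-empty 'site' value.
with_site = [r for r in routes if r.get("site")]

# The generator stops at the first London entry; None is the fallback.
london_next_hop = next(
    (r["next_hop"] for r in with_site if r["site"] == "London"),
    None,
)

print(london_next_hop)  # 172.18.1.5
```

This scans the list once per lookup, which is fine for a handful of routes; for repeated lookups an index keyed by site is likely the better fit.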
Answer 3 (score: 0)
Use an actual data analysis tool, such as pandas:
import pandas as pd
df1 = pd.DataFrame(list_1)
df2 = pd.DataFrame(list_2)
print(df1.merge(df2))
#             mask      next_hop      route        site
# 0  255.255.255.0    172.18.1.5  10.10.4.0   Edinburgh
# 1  255.255.255.0    172.18.1.5  10.10.5.0      London
# 2  255.255.255.0  172.16.66.34  10.10.8.0  Manchester
Answer 4 (score: 0)
>>> from itertools import groupby, chain
>>> from pprint import pprint
>>> temp_merged_data = sorted(chain(list_1, list_2), key=lambda x: x['route'])
>>> route_data = [dict(chain(*map(dict.items, v))) for k, v in groupby(temp_merged_data, key=lambda x: x['route'])]
>>> route_data = [d for d in route_data if 'site' in d]
>>> pprint(route_data)
[{'mask': '255.255.255.0',
'next_hop': '172.18.1.5',
'route': '10.10.4.0',
'site': 'Edinburgh'},
{'mask': '255.255.255.0',
'next_hop': '172.18.1.5',
'route': '10.10.5.0',
'site': 'London'},
{'mask': '255.255.255.0',
'next_hop': '172.16.66.34',
'route': '10.10.8.0',
'site': 'Manchester'}]
Now, if you convert the route data to a dict keyed by site, you can access each site's parameters more easily:
>>> route_dict = {d['site']:d for d in route_data}
>>> route_dict['London']['next_hop']
'172.18.1.5'
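A dict keyed by site holds one entry per site, so if a site ever had several routes, later entries would silently overwrite earlier ones. The sketch below groups routes per site instead, using `collections.defaultdict`. The second London route is made up for illustration and is not part of the question's data.

```python
from collections import defaultdict

route_data = [
    {"route": "10.10.4.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5", "site": "Edinburgh"},
    {"route": "10.10.5.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5", "site": "London"},
    # Hypothetical extra entry (not in the original data): a second London route.
    {"route": "10.10.6.0", "mask": "255.255.255.0", "next_hop": "172.18.1.9", "site": "London"},
]

# Map each site to the list of all its route entries.
routes_by_site = defaultdict(list)
for entry in route_data:
    routes_by_site[entry["site"]].append(entry)

london_next_hops = [entry["next_hop"] for entry in routes_by_site["London"]]
print(london_next_hops)  # ['172.18.1.5', '172.18.1.9']
```

Indexing a `defaultdict` with an unknown site creates an empty list for it; `routes_by_site.get(site, [])` avoids that side effect.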
Answer 5 (score: 0)
Given the structure of these lists (route information and route-to-site mappings), I don't think merging and grouping is needed.
routes_to_sites = {rs['route']: rs['site'] for rs in list_2}
route_data = []
for ri in list_1:
    site = routes_to_sites.get(ri['route'])
    if site is not None:
        route_data.append({**ri, 'site': site})
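The lookup-dict approach above can be run end to end against the question's data. The following is a sketch only: the function name `attach_sites` is hypothetical, and it uses a membership test where the code above uses `.get()` with a `None` check.

```python
def attach_sites(routes, sites):
    """Return copies of the route dicts that have a site, with 'site' added."""
    # One pass over `sites` builds the route -> site index.
    site_by_route = {s["route"]: s["site"] for s in sites}
    # One pass over `routes` keeps only those found in the index.
    return [
        {**r, "site": site_by_route[r["route"]]}
        for r in routes
        if r["route"] in site_by_route
    ]


list_1 = [
    {"route": "10.10.4.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.5.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.8.0", "mask": "255.255.255.0", "next_hop": "172.16.66.34"},
    {"route": "10.10.58.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "172.18.12.4", "mask": "255.255.255.252", "next_hop": "172.18.1.5"},
]
list_2 = [
    {"route": "10.10.4.0", "site": "Edinburgh"},
    {"route": "10.10.8.0", "site": "Manchester"},
    {"route": "10.10.5.0", "site": "London"},
]

route_data = attach_sites(list_1, list_2)
for entry in route_data:
    print(entry["site"], entry["next_hop"])
```

No sorting is involved, so the result keeps the order of `list_1`, and the input dicts are left unmodified because `{**r, ...}` builds new ones.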