I have a dictionary:

hostServiceDict = {
    "http://192.168.1.1:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO'],
    "http://192.168.1.2:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'UDDC'],
    "http://192.168.1.3:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'HTTPServer'],
    "http://192.168.1.4:8080/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'NetcdfSubset'],
    "http://192.168.1.5:8080/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'WCS', 'NCSS'],
    "http://192.168.1.6:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'DAP4'],
    "http://192.168.1.7:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'NCML', 'DAP4'],
    "http://192.168.1.8:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'NetcdfSubset'],
    "http://192.168.1.9:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'UDDC'],
    "http://192.168.1.18:80/thredds/catalog.xml": ['OPENDAP', 'WMS', 'HTTP', 'ISO', 'NetcdfSubset'],
}

Two entries have the same IP address but differ in the port part: http://192.168.1.5 and http://192.168.1.18. I need to remove the second duplicate entry. This is what I tried:
result = {}
for urls, services in hostServiceDict.items():
    i = urls.strip('http://').strip('thredds/catalog.xml').split(':')
    ip = i[0]
    if ip not in result.items():
        if ip in urls:
            result[urls] = services
print(result)
However, it still gives me the same result as the original dictionary.
Answer 0 (score: 2)
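The check

if ip not in result.items():

never finds ip, because result.items() yields (url, services) pairs and a bare IP string is never equal to one of those tuples, so the condition is always true.
You have to keep track of the IPs you have already seen: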
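result = {}
seen_ips = set()  # IPs that already have an entry in result
for url, services in hostServiceDict.items():
    # str.strip removes a set of characters, not a prefix or suffix;
    # it happens to give the right host for these particular URLs.
    ip = url.strip('http://').strip('thredds/catalog.xml').split(':')[0]
    if ip not in seen_ips:
        seen_ips.add(ip)
        result[url] = services
print(result)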
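To see why the original membership test never matched, here is a small sketch on a two-entry sample. The sample data and the name `in_items` are illustrative assumptions, not taken from the question, and the host is extracted with `split` rather than `strip` to keep the example unambiguous.

```python
# Illustrative sample; not the asker's full dictionary.
sample = {
    "http://192.168.1.5:80/thredds/catalog.xml": ["OPENDAP", "WMS"],
    "http://192.168.1.5:8080/thredds/catalog.xml": ["OPENDAP", "WCS"],
}

ip = "192.168.1.5"

# items() yields (url, services) tuples, so a plain string is never "in" it.
in_items = ip in sample.items()

# A separate set of seen IPs makes the membership test meaningful.
result = {}
seen_ips = set()
for url, services in sample.items():
    # Drop the scheme, then cut at the first ':' to leave only the host.
    host = url.split("://", 1)[1].split(":", 1)[0]
    if host not in seen_ips:
        seen_ips.add(host)
        result[url] = services

print(in_items)      # False
print(list(result))  # only the first URL for 192.168.1.5 remains
```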
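For cleaner code, you can parse the URL properly instead of stripping characters:
import re

def get_host(url):
    # group(1) is the host captured between the scheme and the first ':' or '/'
    return re.match(r'https?://([^:/]+).*', url).group(1)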
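As an alternative to a regular expression, the standard library's `urllib.parse` can extract the host. This is only a sketch, assuming every key is an absolute URL with a scheme; `get_host_stdlib` is a hypothetical name chosen for illustration.

```python
from urllib.parse import urlsplit

def get_host_stdlib(url):
    """Return the host part of an absolute URL, without the port."""
    # urlsplit(...).hostname omits the port and lowercases the host.
    return urlsplit(url).hostname

host = get_host_stdlib("http://192.168.1.4:8080/thredds/catalog.xml")
print(host)  # 192.168.1.4
```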
Then, rather than removing duplicates "by hand", it is easier to build a host -> (url, services) dictionary:
data_by_hostname = {get_host(url): (url, services)
                    for url, services in hostServiceDict.items()}
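This dictionary takes care of the duplicate hostnames, since each key can occur only once (when a host repeats, the last URL for it is the one kept).
Then, if needed, you can rebuild the url -> services dictionary from its values: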
result = dict(data_by_hostname.values())
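If the first URL seen for each host should survive, as the question asks, `dict.setdefault` can stand in for the comprehension, because a comprehension lets later entries overwrite earlier ones. A minimal sketch on an illustrative three-entry sample (the sample data and the name `first_by_host` are assumptions):

```python
import re

def get_host(url):
    # Capture everything between the scheme and the first ':' or '/'.
    return re.match(r'https?://([^:/]+)', url).group(1)

# Illustrative sample with one repeated host.
sample = {
    "http://192.168.1.5:80/thredds/catalog.xml": ["OPENDAP", "WMS"],
    "http://192.168.1.6:80/thredds/catalog.xml": ["OPENDAP", "DAP4"],
    "http://192.168.1.5:8080/thredds/catalog.xml": ["OPENDAP", "WCS"],
}

first_by_host = {}
for url, services in sample.items():
    # setdefault stores the pair only if the host has no entry yet.
    first_by_host.setdefault(get_host(url), (url, services))

# Each value is a (url, services) pair, so dict() rebuilds url -> services.
result = dict(first_by_host.values())
print(list(result))
```

This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.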
Answer 1 (score: 1)
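You can keep track of the distinct IPs in a list and check each new IP against the ones already tracked. This needs only a small change to your logic, as follows: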
result = {}
distinct_ips = []
for urls, services in hostServiceDict.items():
    i = urls.strip('http://').strip('thredds/catalog.xml').split(':')
    ip = i[0]
    if ip not in distinct_ips:
        distinct_ips.append(ip)
        if ip in urls:
            result[urls] = services
print(result)