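I have a list of hosts that belong to multiple domains and subdomains. I am trying to convert the list into a dict of lists so that the hosts are organized by domain/subdomain.
Python's 'in' string matching matches both the subdomain and its parent domain. I am trying /(?!sub).domain/ as my regex, but it does not seem to match correctly.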
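A likely reason the regex above does not exclude subdomains: (?!sub) is a negative lookahead, so it inspects the text that follows the current position (here ".domain"), not the label before the dot. The minimal sketch below compares it with a negative lookbehind. The two sample hosts come from the host list in this question; the lookbehind pattern is an illustration that only rejects the literal label "sub", not a general solution.

```python
import re

hosts = ['host1.domain.com', 'host20.sub.domain.com']

# Lookahead from the question: checks what FOLLOWS the position, which is
# ".domain", so it never rejects a host that contains "sub.domain".
lookahead = re.compile(r'(?!sub).domain')

# Lookbehind (illustrative): checks the three characters BEFORE the literal
# dot, so ".domain." preceded by "sub" is rejected.
lookbehind = re.compile(r'(?<!sub)\.domain\.')

lookahead_hits = [h for h in hosts if lookahead.search(h)]
lookbehind_hits = [h for h in hosts if lookbehind.search(h)]

print(lookahead_hits)   # both hosts match
print(lookbehind_hits)  # only the host directly under "domain" matches
```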
I am trying to translate List1 into a dict based on List2:
# A list of every host
host_list = [
'host1.domain.com',
'host2.domain.com',
'host20.sub.domain.com',
'host31.sub.domain.com',
'host1.example.com',
'host1.sub.example.com'
]
# A list of all domains we want to organize in the dictionary
domain_list = [
'two.sub.domain',
'sub.example',
'sub.domain',
'domain',
'example'
]
Desired result:
domain_dict = {
'domain': ['host1.domain.com', 'host2.domain.com'],
'sub.domain': ['host20.sub.domain.com', 'host31.sub.domain.com'],
'example': ['host1.example.com'],
'sub.example': ['host1.sub.example.com']
}
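Here is a solution that still uses a domain list and supports multiple levels of subdomains.
One caveat: the domain list must be ordered from the deepest (most specific) subdomain to the shallowest. See the order of domain_list above, where sub.domain comes before domain.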
# We want to protect the original host list
host_list_copy = list(host_list)
domain_dict = {}
for domain in domain_list:
    # Get only the hosts that are part of the same subdomain/domain
    temp_host_list = [x for x in host_list_copy if (domain in x)]
    # Add the list to the dictionary
    domain_dict[domain] = temp_host_list
    # Remove the temp_host_list records from the original host_list_copy
    host_list_copy[:] = [x for x in host_list_copy if x not in temp_host_list]
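The ordering caveat can be lifted by sorting the domains by depth before matching. The following is a minimal sketch, not the original poster's code: the function name group_hosts is hypothetical, and it assumes every host has at least one label before the domain and one after it (such as a TLD), so the domain can be matched with a dot on each side.

```python
def group_hosts(hosts, domains):
    """Assign each host to the most specific domain that matches it."""
    # Try the domains with the most labels first, whatever order they arrive in.
    ordered = sorted(domains, key=lambda d: d.count('.'), reverse=True)
    grouped = {d: [] for d in domains}
    for host in hosts:
        for domain in ordered:
            # Require a dot on both sides so 'domain' cannot match 'mydomain'.
            if '.' + domain + '.' in host:
                grouped[domain].append(host)
                break  # the first hit is the deepest match; stop here
    return grouped


host_list = [
    'host1.domain.com', 'host2.domain.com',
    'host20.sub.domain.com', 'host31.sub.domain.com',
    'host1.example.com', 'host1.sub.example.com',
]
# Deliberately listed shallowest-first to show that order no longer matters.
domain_list = ['domain', 'example', 'sub.domain', 'sub.example', 'two.sub.domain']

result = group_hosts(host_list, domain_list)
print(result)
```

Domains with no hosts, such as two.sub.domain here, keep an empty list in the result.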
Answer 0 (score: 1)
Use conditionals:
list1 = [
'host1.domain.com',
'host2.domain.com',
'host20.sub.domain.com',
'host31.sub.domain.com',
'host1.example.com',
'host1.sub.example.com'
]
list2 = [
'domain',
'example'
]
list3 = [
'sub.domain',
'sub.example'
]
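# One empty bucket per domain and subdomain
my_dict = {i: [] for i in list2 + list3}
for i in list1:
    # list2 and list3 must be aligned: each domain pairs with its subdomain
    for j in zip(list2, list3):
        # Test the subdomain first, because the plain domain is a substring of it
        if j[1] in i:
            my_dict[j[1]].append(i)
        elif j[0] in i:
            my_dict[j[0]].append(i)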
Answer 1 (score: 1)
Here is how I would do it (after a billion edits):
hosts = [
'host1.domain.com',
'host2.domain.com',
'host20.sub.domain.com',
'host31.sub.domain.com',
'host1.example.com',
'host1.sub.example.com'
]
domains = [
'domain',
'sub.domain',
'example',
'sub.example'
]
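import re
import pprint

# Escape the dot so it matches a literal '.', not any character
dot = r'\.'
anything_but_dot = r'[^.]*'
# The host label: everything up to and including the first dot
prefix = anything_but_dot + dot
answer = {}
for domain in domains:
    # re.escape keeps the dots inside e.g. 'sub.domain' literal;
    # the trailing dot stops 'domain' from matching 'domainx'
    compiled = re.compile(prefix + re.escape(domain) + dot)
    answer[domain] = []
    for host in hosts:
        # match() anchors at the start of the host string
        if compiled.match(host):
            answer[domain].append(host)
pprint.pprint(answer)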
This produces:
{'domain': ['host1.domain.com', 'host2.domain.com'],
'example': ['host1.example.com'],
'sub.domain': ['host20.sub.domain.com', 'host31.sub.domain.com'],
'sub.example': ['host1.sub.example.com']}
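If every host follows the shape seen in this thread (one host label, then the domain labels, then a single-label TLD), the domain can also be computed directly instead of matched. This is a sketch under that assumption only: the function name group_by_middle_labels is hypothetical, and it would mis-group hosts with multi-label suffixes such as "co.uk".

```python
from collections import defaultdict


def group_by_middle_labels(hosts):
    """Group hosts by the labels between the host name and the TLD."""
    grouped = defaultdict(list)
    for host in hosts:
        labels = host.split('.')
        # Drop the first label (host name) and the last label (assumed TLD).
        key = '.'.join(labels[1:-1])
        grouped[key].append(host)
    return dict(grouped)


hosts = [
    'host1.domain.com', 'host2.domain.com',
    'host20.sub.domain.com', 'host31.sub.domain.com',
    'host1.example.com', 'host1.sub.example.com',
]

by_domain = group_by_middle_labels(hosts)
print(by_domain)
```

This needs no domain list at all, so it only produces keys for domains that actually appear among the hosts.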