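I am trying to write a Python script that extracts all of the Google Cloud Compute subnets from Google's DNS records. More information on this:
https://cloud.google.com/compute/docs/faq#where_can_i_find_short_product_name_ip_ranges
So far I can pull the list of TXT records for a single hostname as a plain string without any problems.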
import dns.resolver
# Set the resolver
my_resolver = dns.resolver.Resolver()
my_resolver.nameservers = ['8.8.8.8']
answer = my_resolver.query('_cloud-netblocks.googleusercontent.com', 'TXT')
for rdata in answer:
    for txt_string in rdata.strings:
        txt_record = txt_string
This leaves me with the string:
v=spf1 include:_cloud-netblocks1.googleusercontent.com include:_cloud-netblocks2.googleusercontent.com include:_cloud-netblocks3.googleusercontent.com include:_cloud-netblocks4.googleusercontent.com include:_cloud-netblocks5.googleusercontent.com ?all
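
One detail about the snippet above: a single TXT record can arrive as several character strings (each is limited to 255 bytes), and the loop keeps only the last one. A minimal sketch of joining them, assuming `rdata.strings` is a tuple of `bytes`, as it is in dnspython under Python 3. `txt_to_str` is a hypothetical helper name, and the sample tuple is made up to stand in for a real response.

```python
def txt_to_str(strings):
    """Join the character strings of one TXT record into a single str.

    `strings` is assumed to be an iterable of bytes (what dnspython exposes
    as rdata.strings); str items are accepted too for convenience.
    """
    parts = []
    for s in strings:
        parts.append(s.decode("ascii") if isinstance(s, bytes) else s)
    # SPF (RFC 7208) treats the pieces as concatenated without added spaces.
    return "".join(parts)


# Stand-in for rdata.strings; the split point here is made up.
sample = (b"v=spf1 include:_cloud-netblocks1.googleuser", b"content.com ?all")
joined = txt_to_str(sample)
print(joined)
```

In the script, `txt_record = txt_to_str(rdata.strings)` could then replace the inner loop.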
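What I want to do is use re.match to extract the 5 hostnames from this initial response, so that I can look each one up in turn, extract the subnets, and put them into an array. All of my efforts with regex so far have been not so... great. Could someone offer some guidance? Thanks!
Edit:
Here is the full script for anyone else who needs to collect all of the Google Cloud IPs.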
import dns.resolver, re
# Set the resolver
my_resolver = dns.resolver.Resolver()
my_resolver.nameservers = ['8.8.8.8']
# Note: dnspython 2.x deprecates query() in favour of resolve()
answer = my_resolver.query('_cloud-netblocks.googleusercontent.com', 'TXT')
for rdata in answer:
    for txt_string in rdata.strings:
        # strings are bytes under Python 3; decode before splitting
        txt_record = txt_string.decode()
# Extract hostnames into array
hostnames = [x.split(":")[1] for x in txt_record.split() if ":" in x]
total_subnets = []
for host in hostnames:
    answer = my_resolver.query(host, 'TXT')
    for rdata in answer:
        for txt_string in rdata.strings:
            txt_record = txt_string.decode()
            # Collect both IPv4 and IPv6 ranges from this record
            ip4_subnets = re.findall(r'ip4:(\S+)', txt_record)
            ip6_subnets = re.findall(r'ip6:(\S+)', txt_record)
            total_subnets.extend(ip4_subnets)
            total_subnets.extend(ip6_subnets)
print(total_subnets)
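
The lookup-and-collect flow above can be exercised without touching the network by passing the TXT lookup in as a function. This is only a sketch, not the script's actual method: `collect_subnets`, `lookup`, and the `FAKE_DNS` records are all made up for illustration, and the records use documentation-only address ranges rather than Google's real ones. The standard-library `ipaddress` module is used to reject malformed entries.

```python
import ipaddress
import re


def collect_subnets(name, lookup, seen=None):
    """Follow SPF include: links from `name` and return all ip4/ip6 networks.

    `lookup` is any callable mapping a hostname to its TXT record text.
    """
    seen = set() if seen is None else seen
    if name in seen:          # guard against include loops and repeats
        return []
    seen.add(name)

    record = lookup(name)
    subnets = []
    for value in re.findall(r'\bip[46]:(\S+)', record):
        # ip_network raises ValueError on malformed input
        subnets.append(ipaddress.ip_network(value, strict=False))
    for host in re.findall(r'\binclude:(\S+)', record):
        subnets.extend(collect_subnets(host, lookup, seen))
    return subnets


# Made-up records using documentation ranges (RFC 5737 / RFC 3849).
FAKE_DNS = {
    "root.example": "v=spf1 include:a.example include:b.example ?all",
    "a.example": "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ?all",
    "b.example": "v=spf1 ip6:2001:db8::/32 include:a.example ?all",
}

result = collect_subnets("root.example", FAKE_DNS.__getitem__)
print([str(n) for n in result])
```

With dnspython, `lookup` could be a small wrapper around the resolver call used in the script that returns the record text as a string.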
Answer 0 (score: 1)
You don't need a regex here; use split twice with a list comprehension:
s = "v=spf1 include:_cloud-netblocks1.googleusercontent.com include:_cloud-netblocks2.googleusercontent.com include:_cloud-netblocks3.googleusercontent.com include:_cloud-netblocks4.googleusercontent.com include:_cloud-netblocks5.googleusercontent.com ?all"
print([x.split(":")[1] for x in s.split() if ":" in x])
# => ['_cloud-netblocks1.googleusercontent.com',
# '_cloud-netblocks2.googleusercontent.com',
# '_cloud-netblocks3.googleusercontent.com',
# '_cloud-netblocks4.googleusercontent.com',
# '_cloud-netblocks5.googleusercontent.com']
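See the demo here.
Details:
- s.split() splits the string on whitespace
- if ":" in x keeps only the items that contain a :
- x.split(":")[1] splits each of those items on : and takes the second chunk
Of course, you can use a regex if you prefer: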
include:(\S+)
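See the demo.
This matches include: and captures one or more non-whitespace characters into group 1. re.findall then returns the captured values as a list (re.findall(r'include:(\S+)', s)).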
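
Both approaches return the same hostnames for an `include:` string like the one above. One caveat worth illustrating: `x.split(":")[1]` keeps only the text between the first and second colon, which is fine for `include:` hostnames but would truncate an `ip6:` value. A small sketch, where the input is shortened to two hostnames and the `ip6` record is a made-up example using a documentation prefix:

```python
import re

s = ("v=spf1 include:_cloud-netblocks1.googleusercontent.com "
     "include:_cloud-netblocks2.googleusercontent.com ?all")

by_split = [x.split(":")[1] for x in s.split() if ":" in x]
by_regex = re.findall(r'include:(\S+)', s)
print(by_split == by_regex)  # True

record = "v=spf1 ip6:2001:db8::/32 ?all"   # made-up example value

# Plain split(":") cuts the IPv6 network at its first internal colon.
truncated = [x.split(":")[1] for x in record.split() if ":" in x]
# Limiting the split to one cut keeps everything after the mechanism name.
kept = [x.split(":", 1)[1] for x in record.split() if ":" in x]
# The regex captures the full non-whitespace run as well.
via_regex = re.findall(r'ip6:(\S+)', record)

print(truncated)   # ['2001']
print(kept)        # ['2001:db8::/32']
print(via_regex)   # ['2001:db8::/32']
```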