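I'm currently working on parsing the output of the dig command. The command prints the canonical names, with the actual IP in the last record.
For example, resolving mail.yahoo.com gives the following: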
borrajax@borrajax.kom /tmp/ $ dig @8.8.8.8 @4.2.2.2 +nocomments \
+noquestion +noauthority +noadditional \
+nostats +nocmd mail.yahoo.com
mail.yahoo.com. 0 IN CNAME login.yahoo.com.
login.yahoo.com. 0 IN CNAME ats.login.lgg1.b.yahoo.com.
ats.login.lgg1.b.yahoo.com. 0 IN CNAME ats.member.g02.yahoodns.net.
ats.member.g02.yahoodns.net. 0 IN CNAME any-ats.member.a02.yahoodns.net.
any-ats.member.a02.yahoodns.net. 49 IN A 98.139.21.169
So I'd like to be able to say that mail.yahoo.com resolves to 98.139.21.169. To do that, I need to "merge" mail.yahoo.com into login.yahoo.com, then login.yahoo.com into ats.login.lgg1.b.yahoo.com, and so on, until I reach the final A record.
In another question I already got a nice regular expression for parsing dig's output, so I can clean up the lines and store them in a list:
[
('mail.yahoo.com', 'CNAME', 'login.yahoo.com'),
('login.yahoo.com', 'CNAME', 'ats.login.lgg1.b.yahoo.com'),
('ats.login.lgg1.b.yahoo.com', 'CNAME', 'ats.member.g02.yahoodns.net'),
('ats.member.g02.yahoodns.net', 'CNAME', 'any-ats.member.a02.yahoodns.net'),
('any-ats.member.a02.yahoodns.net', 'A', '98.139.21.169')
]
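For reference, a list like the one above could also be built without a regular expression by splitting each line on whitespace. This is only a minimal sketch, assuming every answer line has the five fields shown in the dig output (name, TTL, class, type, data). `parse_dig_lines` is a hypothetical helper name and is not the regex from the other question.

```python
def parse_dig_lines(text):
    """Turn dig answer lines into (name, record_type, value) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: exactly five fields (name, TTL, class, type, data).
        # Blank lines, ';' comments and multi-field records are skipped.
        if len(fields) != 5 or fields[0].startswith(';'):
            continue
        name, _ttl, _cls, record_type, value = fields
        # Drop the trailing dot of fully qualified names.
        records.append((name.rstrip('.'), record_type, value.rstrip('.')))
    return records


sample = """\
mail.yahoo.com. 0 IN CNAME login.yahoo.com.
any-ats.member.a02.yahoodns.net. 49 IN A 98.139.21.169
"""
records = parse_dig_lines(sample)
```

Record types whose data contains spaces (MX, TXT and so on) are deliberately ignored here, since only CNAME and A records matter for this problem.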
The question is: how can I do this efficiently and in a general way, so that it also works if some unrelated records show up between the CNAMEs:
[
('mail.yahoo.com', 'CNAME', 'login.yahoo.com'),
('foo.com', 'CNAME', 'baz.com'), # Wooops, watch out!
('login.yahoo.com', 'CNAME', 'ats.login.lgg1.b.yahoo.com'),
('ats.login.lgg1.b.yahoo.com', 'CNAME', 'ats.member.g02.yahoodns.net'),
('baz.com', 'A', '204.236.134.199'), # Wooops, watch out!
('ats.member.g02.yahoodns.net', 'CNAME', 'any-ats.member.a02.yahoodns.net'),
('any-ats.member.a02.yahoodns.net', 'A', '98.139.21.169')
]
The desired output is:
mail.yahoo.com resolves to 98.139.21.169
foo.com resolves to 204.236.134.199
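Of course, I could walk through all the CNAMEs and, every time I find one, search for what it actually resolves to, but that would be O(n^2)... and it would be... horrible.
I'm sure there's a better way, but I can't think of it. Thanks in advance for any ideas.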
Answer 0 (score: 1)
I would build a dict and resolve the chain from there:
data = [
('mail.yahoo.com', 'CNAME', 'login.yahoo.com'),
('foo.com', 'CNAME', 'baz.com'), # Wooops, watch out!
('login.yahoo.com', 'CNAME', 'ats.login.lgg1.b.yahoo.com'),
('ats.login.lgg1.b.yahoo.com', 'CNAME', 'ats.member.g02.yahoodns.net'),
('baz.com', 'A', '204.236.134.199'), # Wooops, watch out!
('ats.member.g02.yahoodns.net', 'CNAME', 'any-ats.member.a02.yahoodns.net'),
('any-ats.member.a02.yahoodns.net', 'A', '98.139.21.169')
]
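# Index the records by name: {name: (record_type, value)}
data = {t[0]: t[1:] for t in data}

def lookup(host):
    # Follow the CNAME chain until an A record is reached
    record_type = None
    while record_type != 'A':
        record_type, host = data[host]
    return host
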
assert lookup('mail.yahoo.com') == '98.139.21.169'
assert lookup('foo.com') == lookup('baz.com') == '204.236.134.199'
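The loop above assumes that every chain ends in an A record that is present in the dict. A slightly more defensive variant is sketched below; it returns None when a name is missing or when the CNAMEs form a loop. `safe_lookup` and its `records` parameter are names made up for this example, and the `.example` hosts are invented test data.

```python
def safe_lookup(host, records):
    """Follow CNAMEs in `records` ({name: (type, value)}) to an A record.

    Returns the IP as a string, or None if the chain is broken or loops.
    """
    seen = set()
    while host in records and host not in seen:
        seen.add(host)
        record_type, value = records[host]
        if record_type == 'A':
            return value
        host = value  # CNAME: continue with the target name
    return None


records = {
    'mail.yahoo.com': ('CNAME', 'login.yahoo.com'),
    'login.yahoo.com': ('A', '98.139.21.169'),
    'a.example': ('CNAME', 'b.example'),          # invented: CNAME loop
    'b.example': ('CNAME', 'a.example'),
    'dangling.example': ('CNAME', 'missing.example'),  # invented: no target
}
```

Each lookup is still a single walk along the chain, so the cost is proportional to the chain length rather than to the number of records.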
Answer 1 (score: 0)
Here is my solution (see the comments for more on the algorithm):
import copy

def resolve(arr):
    # create an index for easy access of the urls
    index = {item[0]: item[2] for item in arr}
    # copy the index
    mapping = copy.copy(index)
    # loop through the index
    for index_key in index:
        # get the current value
        value = index[index_key]
        # loop through the mapping until the final ip address is reached
        # but only if this url wasn't found before
        while value in mapping:
            # remember the new key (so it can be deleted afterwards)
            key = value
            # get the new value
            value = mapping[key]
            # save the found value as the new value (for later use)
            # this reduces the complexity (-> better performance)
            mapping[index_key] = value
            # delete the "one in the middle" out of the mapping
            # so that the next item doesn't have to search for
            # the correct mapping (because it has been found already)
            del mapping[key]
    return mapping
With this script you can see that it produces the same output no matter how the list is ordered:
import random
data = [
('mail.yahoo.com', 'CNAME', 'login.yahoo.com'),
('foo.com', 'CNAME', 'baz.com'), # Wooops, watch out!
('login.yahoo.com', 'CNAME', 'ats.login.lgg1.b.yahoo.com'),
('ats.login.lgg1.b.yahoo.com', 'CNAME', 'ats.member.g02.yahoodns.net'),
('baz.com', 'A', '204.236.134.199'), # Wooops, watch out!
('ats.member.g02.yahoodns.net', 'CNAME', 'any-ats.member.a02.yahoodns.net'),
('any-ats.member.a02.yahoodns.net', 'A', '98.139.21.169')
]
# test 50 times
for x in range(50):
    # shuffle the data array
    random.shuffle(data)
    print(resolve(data))
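Note that `resolve` removes the intermediate names from its result. If you would rather keep every name (for example when two aliases share the same target), one alternative is to cache the final value for each name instead of deleting entries. The following is only a sketch of that idea; `resolve_all` is a hypothetical name, and the three-record `shared` list is invented test data.

```python
def resolve_all(arr):
    """Map every name in `arr` to its final value, following CNAMEs."""
    target = {name: value for name, _type, value in arr}
    resolved = {}
    for name in target:
        path, on_path = [], set()
        current = name
        # Walk until we reach a value that is not a known name,
        # or a name whose final value is already cached.
        while current in target and current not in resolved:
            if current in on_path:
                current = None  # CNAME loop: no final value
                break
            path.append(current)
            on_path.add(current)
            current = target[current]
        final = resolved.get(current, current)
        # Cache the result for every name visited on the way.
        for visited in path:
            resolved[visited] = final
    return resolved


# Invented example: two aliases pointing at the same target.
shared = [
    ('a.example', 'CNAME', 'c.example'),
    ('b.example', 'CNAME', 'c.example'),
    ('c.example', 'A', '192.0.2.1'),
]
result = resolve_all(shared)
```

Every name is appended to a path at most once, because later walks stop as soon as they hit a cached name, so the whole pass stays linear in the number of records.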