Extracting domain names from DNS response packets using the dpkt library

Time: 2013-07-26 12:09:13

Tags: parsing python-2.7 pcap

I am trying to use the dpkt library to generate a list of all the domain names in a pcap file, together with their corresponding IP addresses.

My code, which is largely based on an existing example, is:

import socket

import dpkt

filename = raw_input('Type filename of pcap file (without extension): ')
path = 'c:/temp/PcapParser/' + filename + '.pcap'
f = open(path, 'rb')
pcap = dpkt.pcap.Reader(f)
for ts, buf in pcap:
    #make sure we are dealing with IP traffic
    try:
        eth = dpkt.ethernet.Ethernet(buf)
    except:
        continue
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:  # 0x0800 == 2048
        continue
    #make sure we are dealing with UDP protocol
    try:
        ip = eth.data
    except:
        continue
    if ip.p != dpkt.ip.IP_PROTO_UDP:  # protocol number 17
        continue
    #filter on UDP assigned ports for DNS
    try:
        udp = ip.data
    except:
        continue
    if udp.sport != 53 and udp.dport != 53:
        continue
    #make the dns object out of the udp data and
    #check for it being a RR (answer) and for opcode QUERY
    try:
        dns = dpkt.dns.DNS(udp.data)
    except:
        continue
    if dns.qr != dpkt.dns.DNS_R:
        continue
    if dns.opcode != dpkt.dns.DNS_QUERY:
        continue
    if dns.rcode != dpkt.dns.DNS_RCODE_NOERR:
        continue
    if len(dns.an) < 1:
        continue
    #process and print responses based on record type
    for answer in dns.an:
        if answer.type == dpkt.dns.DNS_A:  # A record (type 1)
            print 'Domain Name: ', answer.name, '\tIP Address: ', socket.inet_ntoa(answer.rdata)

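The problem is that answer.name is not good enough for me, because I need the originally requested domain name rather than its CNAME representation. For example, one of the original DNS requests was for www.paypal.com, but its CNAME representation is paypal.112.2o7.net.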

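I went over the code carefully and realized that I am actually extracting information from the DNS response (not from the query). I then looked at the response packet in Wireshark and saw that the original domain name appears under "Queries" as well as under "Answers". So my question is: how do I extract it?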

Thanks!

1 Answer:

Answer 0: (score: 4)

To get the name from the Questions section of a DNS response, via the dns.qd object that dpkt.dns provides, all I had to do was:

for qname in dns.qd:
    print qname.name
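
To pair each originally requested name with the addresses it resolved to, something along these lines might work. It is only a minimal sketch: the helper name `queried_names_to_ips` is made up for illustration, and it assumes `dns` behaves like a parsed `dpkt.dns.DNS` response (a `qd` list whose items have `.name`, and an `an` list whose items have `.type` and, for A records, a packed 4-byte `.rdata`). It also assumes that every A record in the response belongs to the queried name, which is the usual case when the answer is a CNAME chain ending in A records.

```python
import socket

DNS_A = 1  # same numeric value as dpkt.dns.DNS_A


def queried_names_to_ips(dns):
    """Map each name in the Questions section to the A-record IPs in the Answers.

    Hypothetical helper: `dns` is expected to look like a parsed
    dpkt.dns.DNS response (attributes `qd` and `an`).
    """
    # Collect the IPv4 addresses from the A records, skipping CNAMEs etc.
    ips = [socket.inet_ntoa(rr.rdata) for rr in dns.an if rr.type == DNS_A]
    # Key the result by the name that was originally asked for,
    # not by the (possibly CNAME-rewritten) name on the answer record.
    return dict((question.name, list(ips)) for question in dns.qd)
```

In the loop from the question, this could replace the final `for answer in dns.an` block: call `queried_names_to_ips(dns)` after the rcode and answer-count checks and print each name with its list of addresses.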