Calculating per-IP bandwidth usage with scapy, iftop-style

Time: 2014-01-13 15:32:57

Tags: python networking network-programming scapy

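I am using scapy to sniff a mirror port and generate a list of the top 10 "talkers", i.e. the hosts using the most bandwidth on my network. I am aware of existing tools such as iftop and ntop, but I need more control over the output.

The following script samples traffic for 30 seconds and then prints a list of the top 10 talkers in the format "source host -> destination host: bytes". This works well, but how can I calculate the average bytes per second?

I realize that changing sample_interval to 1 second would not give a good sample of the traffic, so it seems I need to average it. So I tried this at the end of the script:

bytes per second = (total bytes / sample_interval)

But the resulting bytes/s figure seems far too low. For example, I ran an rsync between two hosts throttled to 1.5 MB/s, but with the averaging above my script kept computing a rate of around 200 KB/s between those hosts, far below the 1.5 MB/s I expected. I can confirm with iftop that 1.5 MB/s really is the rate between these two hosts.

Am I totaling the packet lengths incorrectly with scapy (see the traffic_monitor_callbak function)? Or is this a poor solution altogether :)?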

from scapy.all import *
from collections import defaultdict
import socket
from pprint import pprint
from operator import itemgetter

sample_interval = 30  # how long to capture traffic, in seconds

# initialize traffic dict
traffic = defaultdict(list)

# return human readable units given bytes
def human(num):
    for x in ['bytes','KB','MB','GB','TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, x)
        num /= 1024.0

# callback function to process each packet
# get total packets for each source->destination combo
def traffic_monitor_callbak(pkt):
    if IP in pkt:
        src = pkt.sprintf("%IP.src%")
        dst = pkt.sprintf("%IP.dst%")

        size = pkt.sprintf("%IP.len%")

        # initialize
        if (src, dst) not in traffic:
            traffic[(src, dst)] = 0

        else:
            traffic[(src, dst)] += int(size)

sniff(iface="eth1", prn=traffic_monitor_callbak, store=0, timeout=sample_interval)

# sort by total bytes, descending
traffic_sorted = sorted(traffic.iteritems(), key=itemgetter(1), reverse=True)    

# print top 10 talkers
for x in range(0, 10):
    src = traffic_sorted[x][0][0]
    dst = traffic_sorted[x][0][1]
    host_total = traffic_sorted[x][3]

    # get hostname from IP
    try:
        src_hostname = socket.gethostbyaddr(src)
    except:
        src_hostname = src

    try:    
        dst_hostname = socket.gethostbyaddr(dst)
    except:
        dst_hostname = dst


    print "%s: %s (%s) -> %s (%s)" % (human(host_total), src_hostname[0], src, dst_hostname[0], dst)

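I'm not sure whether this is a programming (scapy/python) question or more of a general networking question, so I'm calling it a network programming question.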

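2 Answers:

Answer 0 (score: 3)

Hello,

First, there is a bug in the code you posted: instead of host_total = traffic_sorted[x][3], you probably meant host_total = traffic_sorted[x][1].

Next, there is an error in your calculation: you forgot to divide host_total by the sample_interval value.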
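
To make the division concrete, here is a minimal sketch that does not need scapy. It reuses the human() helper from the question (with a fallback return added) and introduces a hypothetical average_rate() function; the host_total value is an illustrative number, namely what a steady 1.5 MB/s flow would add up to over 30 seconds, not a measurement.

```python
def human(num):
    """Format a byte count using 1024-based units, as in the question."""
    for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, unit)
        num /= 1024.0
    return "%3.1f PB" % num


def average_rate(total_bytes, interval_seconds):
    """Hypothetical helper: average bytes per second over the sample window."""
    return float(total_bytes) / interval_seconds


sample_interval = 30  # seconds, as in the question

# Illustrative total only: a steady 1.5 MB/s flow summed over 30 seconds.
host_total = int(1.5 * 1024 * 1024 * sample_interval)

total_text = human(host_total)
rate_text = human(average_rate(host_total, sample_interval)) + "/s"
print("%s in %d s is %s" % (total_text, sample_interval, rate_text))
```

If the rate printed for a real capture is still far below what iftop reports, the total itself is probably too small, for example because packets were dropped during the capture.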

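Since you also want to add the receiver-to-sender traffic to the sender-to-receiver traffic, I think the best approach is to use an "ordered" tuple as the key of a Counter object (the order itself does not matter here: lexicographic order would be fine, but you can also use numeric order, since an IPv4 address is a 4-octet integer). This seems to work quite well: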

#! /usr/bin/env python

sample_interval = 10
interface="eth1"

from scapy.all import *
from collections import Counter
import socket  # explicit import: socket.gethostbyaddr() is used below


# Counter is a *much* better option for what you're doing here. See
# http://docs.python.org/2/library/collections.html#collections.Counter
traffic = Counter()
# You should probably use a cache for your IP resolutions
hosts = {}

def human(num):
    for x in ['', 'k', 'M', 'G', 'T']:
        if num < 1024.: return "%3.1f %sB" % (num, x)
        num /= 1024.
    # just in case!
    return  "%3.1f PB" % (num)

def traffic_monitor_callback(pkt):
    if IP in pkt:
        pkt = pkt[IP]
        # You don't want to use sprintf here, particularly as you're
        # converting .len after that!
        # Here is the first place where you're happy to use a Counter!
        # We use a tuple(sorted()) because a tuple is hashable (so it
        # can be used as a key in a Counter) and we want to sort the
        # addresses so that sender-to-receiver traffic is counted
        # together with receiver-to-sender traffic
        traffic.update({tuple(sorted(map(atol, (pkt.src, pkt.dst)))): pkt.len})

sniff(iface=interface, prn=traffic_monitor_callback, store=False,
      timeout=sample_interval)

# ... and now comes the second place where you're happy to use a
# Counter!
# Plus you can use value unpacking in your for statement.
for (h1, h2), total in traffic.most_common(10):
    # Let's factor out some code here
    h1, h2 = map(ltoa, (h1, h2))
    for host in (h1, h2):
        if host not in hosts:
            try:
                rhost = socket.gethostbyaddr(host)
                hosts[host] = rhost[0]
            except:
                hosts[host] = None
    # Get a nice output
    h1 = "%s (%s)" % (hosts[h1], h1) if hosts[h1] is not None else h1
    h2 = "%s (%s)" % (hosts[h2], h2) if hosts[h2] is not None else h2
    print "%s/s: %s - %s" % (human(float(total)/sample_interval), h1, h2)
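
The sorted-tuple key can also be tried without scapy or a live capture. The sketch below is only an illustration under stated assumptions: ip_to_int(), int_to_ip() and pair_key() are hypothetical helpers built on the standard socket and struct modules (standing in for scapy's atol and ltoa), and the packets list is made-up sample data, not captured traffic.

```python
import socket
import struct
from collections import Counter


def ip_to_int(addr):
    """Dotted-quad IPv4 string -> unsigned 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(addr))[0]


def int_to_ip(num):
    """Unsigned 32-bit integer -> dotted-quad IPv4 string."""
    return socket.inet_ntoa(struct.pack("!I", num))


def pair_key(src, dst):
    """Direction-independent key: the same tuple for A->B and B->A."""
    return tuple(sorted((ip_to_int(src), ip_to_int(dst))))


traffic = Counter()

# Made-up (src, dst, IP length) records standing in for sniffed packets.
packets = [
    ("10.0.0.2", "10.0.0.1", 1500),
    ("10.0.0.1", "10.0.0.2", 40),
    ("10.0.0.3", "10.0.0.1", 600),
]
for src, dst, length in packets:
    traffic[pair_key(src, dst)] += length

for (h1, h2), total in traffic.most_common(10):
    print("%s - %s: %d bytes" % (int_to_ip(h1), int_to_ip(h2), total))
```

Both directions between 10.0.0.1 and 10.0.0.2 end up under a single key, so the Counter holds two entries rather than three.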

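Scapy may not be fast enough for this job. To be sure, you can capture your traffic to a file for sample_interval seconds, using for example tcpdump -w, and then run the following (by the way, have a look at the way the function is applied to the packets; I think it is a good thing to know if you use Scapy often):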

#! /usr/bin/env python

sample_interval = 10
filename="capture.cap"

from scapy.all import *
from collections import Counter
import socket  # explicit import: socket.gethostbyaddr() is used below

traffic = Counter()
hosts = {}

def human(num):
    for x in ['', 'k', 'M', 'G', 'T']:
        if num < 1024.: return "%3.1f %sB" % (num, x)
        num /= 1024.
    return  "%3.1f PB" % (num)

def traffic_monitor_callback(pkt):
    if IP in pkt:
        pkt = pkt[IP]
        traffic.update({tuple(sorted(map(atol, (pkt.src, pkt.dst)))): pkt.len})

# A trick I like: don't use rdpcap() that would waste your memory;
# iterate over a PcapReader object instead.
for p in PcapReader(filename):
    traffic_monitor_callback(p)

for (h1, h2), total in traffic.most_common(10):
    h1, h2 = map(ltoa, (h1, h2))
    for host in (h1, h2):
        if host not in hosts:
            try:
                rhost = socket.gethostbyaddr(host)
                hosts[host] = rhost[0]
            except:
                hosts[host] = None
    h1 = "%s (%s)" % (hosts[h1], h1) if hosts[h1] is not None else h1
    h2 = "%s (%s)" % (hosts[h2], h2) if hosts[h2] is not None else h2
    print "%s/s: %s - %s" % (human(float(total)/sample_interval), h1, h2)

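Answer 1 (score: 0)

This might not be it, but could you be mixing up megabits per second (Mb/s) and megabytes per second (MB/s)? You seem to be measuring the amount of data sent in bytes and then converting it to MB/s, but I wonder whether you set up the rsync with a rate specified in bits, i.e. 1.5 Mb/s. If so, the fact that your script gives you 200 kB/s is at least in the right ballpark for 1.5 Mb/s...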
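
The arithmetic behind that ballpark figure can be checked with a short sketch. It assumes a decimal megabit (10^6 bits) and 8 bits per byte; bits_to_bytes_per_second() is a hypothetical helper name used only for this illustration.

```python
def bits_to_bytes_per_second(bits_per_second):
    """Convert a bit rate to a byte rate (8 bits per byte)."""
    return bits_per_second / 8.0


rate_mbit = 1.5  # Mb/s, the rate under discussion if it were given in bits

# Assumption: 1 Mb = 1000 * 1000 bits (decimal prefix).
bytes_per_second = bits_to_bytes_per_second(rate_mbit * 1000 * 1000)

print("%.1f kB/s (1000-based)" % (bytes_per_second / 1000))
print("%.1f KB/s (1024-based)" % (bytes_per_second / 1024))
```

Under those assumptions, 1.5 Mb/s works out to 187.5 kB/s (about 183.1 KB/s with 1024-based units), which is close to the roughly 200 KB/s the script reports.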