Question

我有一个txt文件，如下所示：

0.065998       81   
0.319601      81   
0.539613      81  
0.768445      81  
1.671893      81  
1.785064      81  
1.881242      954  
1.921503      193  
1.921605      188  
1.943166      81  
2.122283      63  
2.127669      83  
2.444705      81

第一列是数据包到达和第二个数据包大小（以字节为单位）。

我需要每秒获得字节的平均值。例如，在第一秒中，我只有值为81的数据包，因此平均比特率为81*8= 648bit/s。然后我应该以秒为单位绘制图表x轴时间，y轴每秒平均比特率。

到目前为止，我只设法将数据上传为数组：

import numpy as np

d = np.genfromtxt('data.txt')

x = (d[:,0])  
y = (d[:,1 ])

print x  
print(y*8)

我是Python的新手，所以任何帮助从哪里开始都会非常感激！

以下是结果脚本：

import matplotlib.pyplot as plt  
import numpy as np  
x, y = np.loadtxt('data.txt', unpack=True)  
bins = np.arange(60+1)  
totals, edges = np.histogram(x, weights=y, bins=bins)  
counts, edges = np.histogram(x, bins=bins)  

print counts  
print totals*0.008/counts  

plt.plot(totals*0.008/counts, 'r')  
plt.xlabel('time, s')  
plt.ylabel('kbit/s')  
plt.grid(True)  
plt.xlim(0.0, 60.0)  
plt.show()

该脚本读取包含数据包大小（字节）和到达时间的.txt文件，并绘制一段时间内的平均比特率。用于监控服务器传入/传出流量！

Answer 1

您的数据已按时间排序，因此我可能只使用itertools.groupby来存储此数据：

from itertools import groupby
with open('data.txt') as d:
     data = ([float(x) for x in line.split()] for line in d)
     for i_time,packet_info in groupby(data,key=lambda x:int(x[0])):
         print i_time, sum(x[1] for x in packet_info)

输出是：

0 324.0
1 1578.0
2 227.0

Answer 2

如果您想使用numpy，可以使用numpy.histogram：

>>> import numpy as np
>>> x, y = np.loadtxt('data.txt', unpack=True)
>>> bins = np.arange(10+1)
>>> totals, edges = np.histogram(x, weights=y, bins=bins)
>>> totals
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

这给出了每个箱子中的总数，你可以除以箱子的宽度来获得近似的瞬时率：

>>> totals/np.diff(bins)
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

（好吧，因为箱子宽度都是一个，这不是很有趣。）

[更新]

我不确定我理解你的后续评论，你需要每秒的平均数据包大小 - 我在你的问题的任何地方都没有看到提到的，但我很想知道错过了明显的... ： - /在任何情况下，如果你想要一个时间段中的数据包数，那么你根本不需要设置权重（默认为1）：

>>> counts, edges = np.histogram(x, bins=bins)
>>> counts
array([4, 6, 3, 0, 0, 0, 0, 0, 0, 0])

其中count是到达每个bin的数据包数。

Answer 3

由于到达时间不规则，我建议将它们量化为整数秒，然后聚合给定秒的所有到达的总字节数。完成后，绘图和其他分析变得更加容易。

python每秒平均比特率

3 个答案: