I have a Pandas DataFrame like this (the "Timestamp" column is of datetime type and is the index):
Server Citta Nazione download Mb/s upload Mb/s ping Isp
Timestamp
2020-04-01 11:02:04 AlternatYva srl Rome Italy 12.550000 0.890000 70.918 Warian S.R.L.
2020-04-01 11:04:12 AlternatYva srl Rome Italy 10.880000 0.510000 64.908 Warian S.R.L.
2020-04-01 11:06:07 Fastweb SpA Rome Italy 11.200000 0.650000 63.223 Warian S.R.L.
2020-03-23 05:00:13 Fastweb SpA Rome Italy 13.956026 0.629037 31.809 Warian S.R.L.
2020-03-23 05:02:08 AlternatYva srl Rome Italy 10.887535 0.224637 31.200 Warian S.R.L.
... ... ... ... ... ... ... ...
2020-04-07 09:03:37 AlternatYva srl Rome Italy 12.560000 1.030000 55.119 Warian S.R.L.
2020-04-07 09:05:12 Fastweb SpA Rome Italy 13.640000 0.770000 29.715 Warian S.R.L.
2020-04-25 02:01:52 AlternatYva srl Rome Italy 10.990000 0.040000 74.318 Warian S.R.L.
2020-04-25 02:03:28 Telecom Italia S.p.A. Rome Italy 11.510000 1.090000 137.830 Warian S.R.L.
2020-04-25 02:04:56 Telecom Italia S.p.A. Rome Italy 12.960000 0.330000 65.324 Warian S.R.L.
[6726 rows x 7 columns]
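I want to create a new df holding the mean of the "download Mb/s" column for each hour of the day. It should have two columns, HOUR and mean, like this: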
HOUR mean
0 12.
1 13.5
2 4.8
3 9.6
...
23 10.2
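So far I can compute the mean of the "download Mb/s" column over the whole DataFrame with the mean() function. I have also learned that the between_time() function lets me select all rows whose "Timestamp" index falls between two times of day.
What is the right way to combine the two functions to get the DataFrame described above?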
Answer 0 (score: 0)
You can group by the hour of the index, with no need for between_time():
df.groupby(df.index.hour)[['download Mb/s', 'upload Mb/s']].mean()
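To get exactly the two-column HOUR / mean layout asked for in the question, the grouped result can be renamed and its index reset. The following is a minimal sketch; the sample values are made up for illustration and only stand in for the real speed-test log, and the column names HOUR and mean are taken from the desired output in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real speed-test log.
idx = pd.to_datetime([
    "2020-04-01 11:02:04",
    "2020-04-01 11:04:12",
    "2020-03-23 05:00:13",
    "2020-03-23 05:02:08",
    "2020-04-25 02:01:52",
])
df = pd.DataFrame(
    {"download Mb/s": [12.0, 10.0, 14.0, 11.0, 10.5]},
    index=idx,
)
df.index.name = "Timestamp"

# Group rows by the hour of the datetime index (0-23), average the
# download column, then turn the result into a two-column DataFrame.
hourly = (
    df.groupby(df.index.hour)["download Mb/s"]
    .mean()
    .rename_axis("HOUR")
    .reset_index(name="mean")
)
print(hourly)
```

Hours with no measurements do not appear in the result. If all 24 rows are needed, reindexing the grouped Series on range(24) before resetting the index is one way to add them as NaN.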