I have the following CSV file:
DateTIme,172.25.150.88,172.25.150.12,172.25.150.105,172.25.150.43,172.25.150.47,172.25.150.95
2018-11-0202:49:42,54457,51776,43164,52074,48227,52165
2018-11-0202:49:43,48728,48516,47605,48202,48077,48304
2018-11-0202:49:44,47879,48699,48243,48153,48483,48364
I import the file with pandas and parse the timestamps into a datetime column:
import pandas as pd
data = pd.read_csv("throughput_88_12_105_43_47_95.csv")
data['datetime'] = pd.to_datetime(data['DateTIme'],format='%Y-%m-%d%H:%M:%S')
The data DataFrame now looks like this:
DateTIme 172.25.150.88 172.25.150.12 172.25.150.105 172.25.150.43 172.25.150.47 172.25.150.95 datetime
0 2018-11-0202:49:42 54457 51776 43164 52074 48227 52165 2018-11-02 02:49:42
1 2018-11-0202:49:43 48728 48516 47605 48202 48077 48304 2018-11-02 02:49:43
2 2018-11-0202:49:44 47879 48699 48243 48153 48483 48364 2018-11-02 02:49:44
3 2018-11-0202:49:45 48009 48751 47813 48359 48581 48793 2018-11-02 02:49:45
4 2018-11-0202:49:46 48905 48650 47578 48285 48055 48761 2018-11-02 02:49:46
To compute 1-minute averages over all columns:
df = pd.DataFrame(data = data, columns = ['172.25.150.88','172.25.150.12','172.25.150.105','172.25.150.43','172.25.150.47','172.25.150.95'],index=data['datetime'])
df.resample('1Min').mean()
This gives me:
172.25.150.88 172.25.150.12 172.25.150.105 172.25.150.43 172.25.150.47 172.25.150.95
datetime
2018-11-02 02:49:00 NaN NaN NaN NaN NaN NaN NaN
2018-11-02 02:50:00 NaN NaN NaN NaN NaN NaN NaN
2018-11-02 02:51:00 NaN NaN NaN NaN NaN NaN NaN
How do I get the 1-minute averages? All I get is NaN.
Answer 0 (score: 1)
Take a look at this:
In [1702]: df
Out[1702]:
DateTIme 172.25.150.88 172.25.150.12 172.25.150.105 172.25.150.43 172.25.150.47 172.25.150.95
0 2018-11-0202:49:42 54457 51776 43164 52074 48227 52165
1 2018-11-0202:49:43 48728 48516 47605 48202 48077 48304
2 2018-11-0202:49:44 47879 48699 48243 48153 48483 48364
In [1703]: df.DateTIme=pd.to_datetime(df.DateTIme,format='%Y-%m-%d%H:%M:%S')
In [1707]: df.resample(rule='1Min', on='DateTIme').mean()
Out[1707]:
172.25.150.88 172.25.150.12 172.25.150.105 172.25.150.43 172.25.150.47 172.25.150.95
DateTIme
2018-11-02 02:49:00 50354.666667 49663.666667 46337.333333 49476.333333 48262.333333 49611.0
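As for why the original attempt returned only NaN: passing an existing DataFrame to pd.DataFrame together with a new index does not relabel the rows, it reindexes them. The frame read from the CSV has the default 0, 1, 2, ... index, so none of the timestamp labels match and every cell comes back as NaN before resample is even called. Below is a minimal sketch of that behaviour, along with an alternative to on= that uses set_index. It assumes the three sample rows from the question, loaded from an in-memory string instead of the real file; the names csv_text, ip_columns, broken, fixed and per_minute are mine, not from the question.

```python
import io

import pandas as pd

# Sample rows copied from the question; a stand-in for the real CSV file.
csv_text = """DateTIme,172.25.150.88,172.25.150.12,172.25.150.105,172.25.150.43,172.25.150.47,172.25.150.95
2018-11-0202:49:42,54457,51776,43164,52074,48227,52165
2018-11-0202:49:43,48728,48516,47605,48202,48077,48304
2018-11-0202:49:44,47879,48699,48243,48153,48483,48364
"""

data = pd.read_csv(io.StringIO(csv_text))
data["datetime"] = pd.to_datetime(data["DateTIme"], format="%Y-%m-%d%H:%M:%S")

# Every column except the two timestamp columns holds throughput values.
ip_columns = [c for c in data.columns if c not in ("DateTIme", "datetime")]

# The question's construction: the existing frame is reindexed by timestamp
# labels that are not present in its 0..n-1 index, so every cell is NaN.
broken = pd.DataFrame(data=data, columns=ip_columns, index=data["datetime"])
print(broken.isna().all().all())

# Alternative to resample(on=...): make the parsed timestamps the index
# first, then resample on that index.
fixed = data.set_index("datetime")[ip_columns]
per_minute = fixed.resample("1min").mean()
print(per_minute)
```

With these three rows everything falls into the single 02:49:00 bucket, so per_minute should match the output shown above. Whether you use on='DateTIme' or set_index first is mostly a matter of taste; set_index is convenient if you want to keep a DatetimeIndex for later slicing.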
Let me know if this helps.