I do the following:
import numpy as np
import pandas as pd
df = pd.DataFrame({'dttm_utc': pd.date_range('1/1/2012', periods=50, freq=pd.offsets.Minute(n=5))})
df['data1'] = np.random.randint(0, 500, len(df))
Now df is:
data1
dttm_utc
2012-01-01 00:00:00 266
2012-01-01 00:05:00 384
2012-01-01 00:10:00 31
2012-01-01 00:15:00 306
2012-01-01 00:20:00 180
2012-01-01 00:25:00 58
2012-01-01 00:30:00 350
2012-01-01 00:35:00 190
2012-01-01 00:40:00 359
2012-01-01 00:45:00 493
2012-01-01 00:50:00 406
2012-01-01 00:55:00 363
2012-01-01 01:00:00 188
2012-01-01 01:05:00 446
2012-01-01 01:10:00 391
2012-01-01 01:15:00 13
2012-01-01 01:20:00 135
2012-01-01 01:25:00 220
2012-01-01 01:30:00 20
2012-01-01 01:35:00 408
2012-01-01 01:40:00 189
2012-01-01 01:45:00 70
2012-01-01 01:50:00 319
2012-01-01 01:55:00 72
2012-01-01 02:00:00 422
2012-01-01 02:05:00 307
2012-01-01 02:10:00 433
2012-01-01 02:15:00 251
2012-01-01 02:20:00 361
2012-01-01 02:25:00 153
2012-01-01 02:30:00 50
2012-01-01 02:35:00 67
2012-01-01 02:40:00 366
2012-01-01 02:45:00 17
2012-01-01 02:50:00 230
2012-01-01 02:55:00 29
2012-01-01 03:00:00 59
2012-01-01 03:05:00 437
2012-01-01 03:10:00 468
2012-01-01 03:15:00 293
2012-01-01 03:20:00 412
2012-01-01 03:25:00 48
2012-01-01 03:30:00 255
2012-01-01 03:35:00 260
2012-01-01 03:40:00 98
2012-01-01 03:45:00 132
2012-01-01 03:50:00 252
2012-01-01 03:55:00 75
2012-01-01 04:00:00 441
2012-01-01 04:05:00 379
I set the index to 'dttm_utc':
df.set_index('dttm_utc', inplace=True)
As the last step, I want to resample df hourly and compute the sum and the mean of data1 for each hour, so I do this:
df.resample('H').agg([np.sum, np.mean])
But I get this error:
AttributeError: 'DataFrame' object has no attribute 'agg'
How can I get around this? Please help!
Answer 0 (score: 5)
Using Pandas 0.17.1, pass the aggregations through the how argument of resample:
>>> df.resample('H', how=['sum', 'mean'])
data1
sum mean
dttm_utc
2012-01-01 00:00:00 2566 213.833333
2012-01-01 01:00:00 2667 222.250000
2012-01-01 02:00:00 3154 262.833333
2012-01-01 03:00:00 3897 324.750000
2012-01-01 04:00:00 329 164.500000
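For readers on a newer pandas: from 0.18 onward, resample() returns a Resampler object rather than an already aggregated DataFrame, and the how argument was deprecated and later removed, so the chained form from the question should work there. Below is a minimal sketch under a few assumptions that are mine, not the original poster's: the values are deterministic (0, 1, 2, ...) instead of random so the result is reproducible, the aggregations are passed as the strings 'sum' and 'mean' instead of np.sum and np.mean, and pd.offsets.Hour() stands in for the 'H' alias, which recent releases deprecate.

```python
import numpy as np
import pandas as pd

# Same shape as the question's frame: 50 rows at 5-minute spacing.
# Assumption: deterministic values (0..49) replace the random integers.
idx = pd.date_range('2012-01-01', periods=50, freq=pd.offsets.Minute(n=5))
df = pd.DataFrame({'data1': np.arange(len(idx))}, index=idx)
df.index.name = 'dttm_utc'

# On pandas >= 0.18, resample() yields a Resampler; agg() then computes
# both statistics for every hourly bin.
hourly = df.resample(pd.offsets.Hour()).agg(['sum', 'mean'])
print(hourly)
```

The result has one row per hour and a two-level column index, (data1, sum) and (data1, mean), which is the same layout as the how=['sum', 'mean'] output above. The last bin covers only the two samples at 04:00 and 04:05.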