Pandas groupby starting at the beginning of the day

Time: 2017-01-24 17:11:41

Tags: python pandas numpy

How can I make the grouping start at the beginning of the day?

df.groupby(pd.TimeGrouper(freq='3600s')).aggregate(np.mean)

Input to this code:

2016-12-09 22:00:00         0.78
2016-12-09 23:30:00         0.37
2016-12-10 00:20:00         0.24
2016-12-10 01:30:00         0.22
2016-12-10 02:00:00         0.19

Output of this code:

2016-12-09 22:00:00         0.78
2016-12-09 23:00:00         0.37
2016-12-10 00:00:00         0.24
2016-12-10 01:00:00         0.22
2016-12-10 02:00:00         0.19

Desired output:

2016-12-09 00:00:00         NaN
2016-12-09 01:00:00         NaN
...
2016-12-09 22:00:00         0.78
2016-12-09 23:00:00         0.37
2016-12-10 00:00:00         0.24
2016-12-10 01:00:00         0.22
2016-12-10 02:00:00         0.19

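1 answer:

Answer 0: (score: 0)

I think the simplest approach is to resample, take the mean, and then reindex the DataFrame so that the index starts at midnight of the first day.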

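# Parse the timestamps, then average the values within each hour.
df['date'] = pd.to_datetime(df.date)
df1 = df.set_index('date').resample('h').mean()

# Hourly index from midnight of the first day to the last bin; hours with no data become NaN.
new_idx = pd.date_range(df1.index.min().date(), df1.index.max(), freq='h')
df1.reindex(new_idx)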

                     value
2016-12-09 00:00:00    NaN
2016-12-09 01:00:00    NaN
2016-12-09 02:00:00    NaN
2016-12-09 03:00:00    NaN
2016-12-09 04:00:00    NaN
2016-12-09 05:00:00    NaN
2016-12-09 06:00:00    NaN
2016-12-09 07:00:00    NaN
2016-12-09 08:00:00    NaN
2016-12-09 09:00:00    NaN
2016-12-09 10:00:00    NaN
2016-12-09 11:00:00    NaN
2016-12-09 12:00:00    NaN
2016-12-09 13:00:00    NaN
2016-12-09 14:00:00    NaN
2016-12-09 15:00:00    NaN
2016-12-09 16:00:00    NaN
2016-12-09 17:00:00    NaN
2016-12-09 18:00:00    NaN
2016-12-09 19:00:00    NaN
2016-12-09 20:00:00    NaN
2016-12-09 21:00:00    NaN
2016-12-09 22:00:00   0.78
2016-12-09 23:00:00   0.37
2016-12-10 00:00:00   0.24
2016-12-10 01:00:00   0.22
2016-12-10 02:00:00   0.19
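
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same resample-then-reindex idea. The sample values are copied from the question; the column names `date` and `value` follow the code above and are assumptions about the asker's actual frame, and `hourly_mean_from_day_start` is a hypothetical helper name used only for illustration. It uses `Timestamp.normalize()` in place of `.date()` to get midnight of the first day.

```python
import pandas as pd

# Sample data copied from the question. The column names 'date' and 'value'
# are assumptions; adjust them to match the real DataFrame.
df = pd.DataFrame({
    'date': ['2016-12-09 22:00:00', '2016-12-09 23:30:00',
             '2016-12-10 00:20:00', '2016-12-10 01:30:00',
             '2016-12-10 02:00:00'],
    'value': [0.78, 0.37, 0.24, 0.22, 0.19],
})
df['date'] = pd.to_datetime(df['date'])


def hourly_mean_from_day_start(frame):
    """Hypothetical helper: hourly means, padded back to midnight of the first day."""
    # Average the values that fall within each hour.
    hourly = frame.set_index('date').resample('h').mean()
    # normalize() sets the time of day to 00:00:00 and keeps the date.
    start = hourly.index.min().normalize()
    # Hourly index from midnight of the first day through the last hourly bin.
    full_index = pd.date_range(start, hourly.index.max(), freq='h')
    # Hours absent from the resampled result are filled with NaN.
    return hourly.reindex(full_index)


result = hourly_mean_from_day_start(df)
print(result)
```

The reindex is needed because the resampled bins only span the range from the first to the last observation, so the hours before 22:00 on the first day never appear unless the index is extended explicitly.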