I would like to group a Pandas dataframe by hour disregarding the date.
My data:
id   opened_at            count  sum
154  2016-07-01 07:02:05  1      46.14
154  2016-07-01 07:34:02  1      479
154  2016-07-01 10:10:01  1      127.14
154  2016-07-02 12:01:04  1      8.14
154  2016-07-02 12:00:50  1      18.14
I am able to group by hour with date taken into account by using the following.
groupByLocationDay = df.groupby([df.id,
pd.Grouper(key='opened_at', freq='3h')])
I get the following:
id   opened_at            count  sum
154  2016-07-01 06:00:00  2      4296.14
154  2016-07-01 09:00:00  46     43716.79
154  2016-07-01 12:00:00  169    150827.14
154  2016-07-02 12:00:00  17     1508.14
154  2016-07-02 09:00:00  10     108.14
How can I group by hour only, so that the result looks like the following?
id   opened_at  count  sum
154  06:00:00   2      4296.14
154  09:00:00   56     43824.93
154  12:00:00   203    152335.28
The original data is on an hourly basis, so I need to aggregate it at a 3h frequency. Thanks!
Answer 0 (score: 2)
You can do it this way, by flooring the hour of day to the start of its 3-hour bin and grouping on that:
In [134]: df
Out[134]:
id opened_at count sum
0 154 2016-07-01 07:02:05 1 46.14
1 154 2016-07-01 07:34:02 1 479.00
2 154 2016-07-01 10:10:01 1 127.14
3 154 2016-07-02 12:01:04 1 8.14
4 154 2016-07-02 12:00:50 1 18.14
5 154 2016-07-02 08:34:02 1 479.00
In [135]: df.groupby(['id', df.opened_at.dt.hour // 3 * 3]).sum()
Out[135]:
               count      sum
id  opened_at
154 6              3  1004.14
    9              1   127.14
    12             2    26.28
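The session above can be reproduced as a standalone script. This is only a sketch under a few assumptions: the frame is rebuilt by hand from the rows shown, the names `bin_start`, `result` and `labelled` are made up for illustration, and the numeric columns are selected explicitly because recent pandas versions may refuse to sum the leftover datetime column. The last step turns the integer bin start into a label like 06:00:00, which is the format asked for in the question.

```python
import pandas as pd

# Sample frame rebuilt by hand from the rows shown in the session above.
df = pd.DataFrame({
    "id": [154] * 6,
    "opened_at": pd.to_datetime([
        "2016-07-01 07:02:05",
        "2016-07-01 07:34:02",
        "2016-07-01 10:10:01",
        "2016-07-02 12:01:04",
        "2016-07-02 12:00:50",
        "2016-07-02 08:34:02",
    ]),
    "count": [1] * 6,
    "sum": [46.14, 479.00, 127.14, 8.14, 18.14, 479.00],
})

# Floor each hour of day to the start of its 3-hour bin: 7 -> 6, 10 -> 9, 12 -> 12.
# The date part is ignored because only .dt.hour is used.
bin_start = (df["opened_at"].dt.hour // 3 * 3).rename("opened_at")

# Select the numeric columns explicitly so the datetime column is not summed.
result = df.groupby(["id", bin_start])[["count", "sum"]].sum()

# Optional: show the bin start as a time-of-day label such as "06:00:00".
labelled = result.reset_index()
labelled["opened_at"] = labelled["opened_at"].map(lambda h: f"{int(h):02d}:00:00")
labelled = labelled.set_index(["id", "opened_at"])

print(labelled)
```

Changing the divisor and multiplier from 3 to another whole number of hours gives a different bin width with the same approach.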