Cutting by time in pandas

Asked: 2014-05-28 16:26:04

Tags: python numpy pandas

Is there a more efficient way to group by time-of-day intervals? I would like to group into [00:00 - 12:00], [12:00 - 16:00], [16:00 - 00:00].

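from datetime import datetime, time

from numpy import array
from pandas import Series

s = Series({
    datetime(2014, 1, 10, 0): 1,
    datetime(2014, 1, 10, 10): 2,
    datetime(2014, 1, 10, 11): 3,
    datetime(2014, 1, 12, 12): 3,
    datetime(2014, 1, 15, 17): 4,
    datetime(2014, 1, 15, 22): 5
})

# Bin start times; each timestamp is assigned to the latest start <= its time of day.
arr = array([time(0), time(12), time(16)])
print(s.groupby(lambda x: arr[::-1][(arr[::-1] <= x.time()).argmax()]).sum())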

    00:00:00    6
    12:00:00    3
    16:00:00    9
    dtype: int64

In addition, I would like to label each group with a new index value: the last datetime that occurs in that group:

    2014-01-10 11:00:00    6
    2014-01-12 12:00:00    3
    2014-01-15 22:00:00    9
    dtype: int64

1 Answer:

Answer 0: (score: 0)

Your time bins are irregular, so this is a bit tricky.

In [68]: times = ['00:00','12:00','16:00']

In [69]: Series(dict([ (start,s.between_time(start,end,include_end=False).sum()) for start,end in zip(times,times[1:]+[times[0]]) ]))
Out[69]: 
00:00    6
12:00    3
16:00    9
dtype: int64
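
The `include_end=False` argument used above comes from the pandas of 2014. As a minimal sketch, assuming pandas 1.4 or later (where `between_time` takes `inclusive="left"` instead), the same sums can be computed as follows. The sketch also relabels each bin with its last timestamp, which is the second thing the question asks for; the names `parts`, `sums` and `last_labelled` are illustrative and not part of the original answer.

```python
from datetime import datetime

import pandas as pd

s = pd.Series({
    datetime(2014, 1, 10, 0): 1,
    datetime(2014, 1, 10, 10): 2,
    datetime(2014, 1, 10, 11): 3,
    datetime(2014, 1, 12, 12): 3,
    datetime(2014, 1, 15, 17): 4,
    datetime(2014, 1, 15, 22): 5,
})

times = ["00:00", "12:00", "16:00"]
# Pair each start with the next start; the last bin wraps around to 00:00.
bins = list(zip(times, times[1:] + times[:1]))

# inclusive="left" keeps the start and excludes the end of each bin
# (assumption: pandas >= 1.4, where it stands in for include_end=False).
parts = {start: s.between_time(start, end, inclusive="left") for start, end in bins}

# Sum per bin, labelled by the bin's start time.
sums = pd.Series({start: part.sum() for start, part in parts.items()})

# Same sums, relabelled with the last timestamp seen in each bin.
# Assumes every bin is non-empty; empty bins would need separate handling.
last_labelled = pd.Series({part.index.max(): part.sum() for part in parts.values()})

print(sums)
print(last_labelled)
```

This still scans the series once per bin, so it is a convenience for a handful of bins rather than a speed-up over the `groupby` in the question.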

On a regular grid, this sums over the correct times.

In [75]: x = s.resample('4H',how='sum',closed='left')

In [76]: x.groupby(x.index.time).sum()
Out[76]: 
00:00:00     1
04:00:00   NaN
08:00:00     5
12:00:00     3
16:00:00     4
20:00:00     5
dtype: float64
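
The `how='sum'` argument in the transcript above is the 2014 spelling. A rough equivalent for current pandas, offered as a sketch and not as the answerer's code, chains `.sum()` onto the resampler. The lowercase `"4h"` alias and the name `by_time` are choices made here. Note that recent pandas versions give 0 for an empty bin where the old output shows NaN.

```python
from datetime import datetime

import pandas as pd

s = pd.Series({
    datetime(2014, 1, 10, 0): 1,
    datetime(2014, 1, 10, 10): 2,
    datetime(2014, 1, 10, 11): 3,
    datetime(2014, 1, 12, 12): 3,
    datetime(2014, 1, 15, 17): 4,
    datetime(2014, 1, 15, 22): 5,
})

# Sum into 4-hour bins closed on the left; empty bins sum to 0 by default.
x = s.resample("4h", closed="left").sum()

# Drop the date part so that only the time of day of each bin start remains,
# then add up the bins that share a time of day.
by_time = x.groupby(x.index.time).sum()
print(by_time)
```

The 4-hour grid lines up with the 00:00, 12:00 and 16:00 boundaries, so the irregular bins from the question can be recovered by adding neighbouring rows (for example 00:00, 04:00 and 08:00 together give the first bin).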