pandas.interval_range for partial intervals

Time: 2019-02-06 15:03:33

Tags: python pandas

I am using pd.interval_range to generate hourly intervals between a pair of timestamps:

In [1]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:00:00'), freq='h'))
Out[1]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right')]

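Is it possible to generate an interval shorter than one hour when the end time does not fall on an hour boundary?

In other words, when I move the end time forward by 1 minute, I get: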

In [2]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:01:00'), freq='h'))
Out[2]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right')]

I would like to get this instead:

In [2]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:01:00'), freq='h'))
Out[2]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right'),
         Interval('2019-02-06 08:00:00', '2019-02-06 08:01:00', closed='right')]

3 answers:

Answer 0: (score: 1)

Try:

import pandas as pd

start = pd.Timestamp('2019-02-06 07:00:00')
end = pd.Timestamp('2019-02-06 09:01:00')

# Full hourly intervals only; the trailing partial hour is dropped here.
interval_1 = pd.interval_range(start, end, freq='h')

# Append one extra interval running from the last full-hour edge to `end`.
interval_out = pd.IntervalIndex.from_arrays(
    interval_1.left.tolist() + [interval_1.right[-1]],
    interval_1.right.tolist() + [end])
interval_out

Output:

IntervalIndex([(2019-02-06 07:00:00, 2019-02-06 08:00:00], (2019-02-06 08:00:00, 2019-02-06 09:00:00], (2019-02-06 09:00:00, 2019-02-06 09:01:00]]
              closed='right',
              dtype='interval[datetime64[ns]]')
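
Note that the snippet above always appends a trailing interval, so an end time that sits exactly on the hour would yield a zero-length interval, and a range shorter than one hour would fail on `right[-1]`. A minimal sketch that guards against both cases, assuming a helper name (`hourly_intervals_with_tail`) that is made up for illustration and is not part of pandas:

```python
import pandas as pd


def hourly_intervals_with_tail(start, end, freq="h"):
    """Hypothetical helper: full `freq` intervals plus one shorter tail interval."""
    full = pd.interval_range(start, end, freq=freq)
    left = full.left.tolist()
    right = full.right.tolist()

    # Edge where the last full interval stops (or `start` if there is none).
    last_edge = right[-1] if right else start

    # Only add the tail when something is actually left over.
    if last_edge < end:
        left.append(last_edge)
        right.append(end)

    return pd.IntervalIndex.from_arrays(left, right, closed=full.closed)


idx = hourly_intervals_with_tail(pd.Timestamp("2019-02-06 07:00:00"),
                                 pd.Timestamp("2019-02-06 08:01:00"))
print(idx)
```

With the timestamps from the question this should give two intervals, the second one covering only 08:00 to 08:01.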

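Answer 1: (score: 1)

Following Scott's suggestion, here is my solution, which places the stub (partial) intervals at the beginning and end of the schedule: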

def interval_range_with_partial_hour(start_time, end_time, freq, closed='right'):
    if start_time == end_time:
        return pd.IntervalIndex.from_arrays(left=[], right=[], closed=closed)

    index = pd.interval_range(start_time.floor(freq), end_time.ceil(freq), freq=freq, closed=closed)
    assert len(index) > 0

    left, right = index.left.to_series().tolist(), index.right.to_series().tolist()
    assert left[0] <= start_time
    assert right[-1] >= end_time

    left[0] = start_time
    right[-1] = end_time
    return pd.IntervalIndex.from_arrays(left=left, right=right, closed=index.closed)
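
The same idea can also be expressed through breakpoints rather than by patching the first and last edges. This is only a sketch of an alternative, not the approach used in the answer above; the function name `intervals_from_breaks` is hypothetical. It collects the start, every `freq` boundary in between, and the end, then hands them to `pd.IntervalIndex.from_breaks`:

```python
import pandas as pd


def intervals_from_breaks(start, end, freq="h", closed="right"):
    """Hypothetical helper: intervals split at every `freq` boundary in [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")

    # Boundaries that fall strictly inside or on the edges of the range.
    inner = pd.date_range(start.ceil(freq), end.floor(freq), freq=freq)

    # union() sorts and drops duplicates, so aligned endpoints are not repeated.
    breaks = pd.DatetimeIndex([start]).union(inner).union(pd.DatetimeIndex([end]))

    return pd.IntervalIndex.from_breaks(breaks, closed=closed)


idx = intervals_from_breaks(pd.Timestamp("2019-02-06 07:20:00"),
                            pd.Timestamp("2019-02-06 09:10:00"))
print(idx)
```

For this input the breakpoints should be 07:20, 08:00, 09:00 and 09:10, giving a short interval at each end and one full hour in the middle.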

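Answer 2: (score: 0)

You can work out in advance how much is left over, in whatever unit you care about. If you are working with hourly Timedeltas but want the remainder in seconds, you can do, for example: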

delta = pd.Timestamp('2019-02-06 08:03:00') - pd.Timestamp('2019-02-06 07:00:00')
delta.seconds % 3600

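In this case you know that 180 seconds are left over, and you can handle the remainder as appropriate, for example by appending one smaller interval to the list.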
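
A minimal sketch of that last step, assuming the same timestamps as above. One caveat: `Timedelta.seconds` is only the seconds component and excludes whole days, so the sketch takes the remainder with Timedelta modulo arithmetic instead. The variable names are illustrative.

```python
import pandas as pd

start = pd.Timestamp("2019-02-06 07:00:00")
end = pd.Timestamp("2019-02-06 08:03:00")
step = pd.Timedelta(hours=1)

# Timedelta % Timedelta yields the leftover part as a Timedelta.
remainder = (end - start) % step

# Full hourly intervals first, then one shorter interval for the leftover.
intervals = list(pd.interval_range(start, end, freq="h"))
if remainder > pd.Timedelta(0):
    intervals.append(pd.Interval(end - remainder, end, closed="right"))

print(remainder.total_seconds())
print(intervals)
```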