I am trying to use Pandas to generate a set of hourly intervals over a predefined range of dates. I used:
import pandas as pd
print(pd.date_range(start='2013-04-01', end='2013-04-30', freq='1H'))
DatetimeIndex(['2013-04-01 00:00:00', '2013-04-01 01:00:00',
'2013-04-01 02:00:00', '2013-04-01 03:00:00',
'2013-04-01 04:00:00', '2013-04-01 05:00:00',
'2013-04-01 06:00:00', '2013-04-01 07:00:00',
'2013-04-01 08:00:00', '2013-04-01 09:00:00',
...
'2013-04-29 15:00:00', '2013-04-29 16:00:00',
'2013-04-29 17:00:00', '2013-04-29 18:00:00',
'2013-04-29 19:00:00', '2013-04-29 20:00:00',
'2013-04-29 21:00:00', '2013-04-29 22:00:00',
'2013-04-29 23:00:00', '2013-04-30 00:00:00'],
dtype='datetime64[ns]', length=697, freq='H')
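However, this gives me an interval only every other hour, i.e. [0-1], [2-3], [4-5], ... What I need instead are partitions like [0-1], [1-2], [2-3], ... How can I do this? Thanks in advance.
Desired output: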
DatetimeIndex(['2013-04-01 00:00:00', '2013-04-01 01:00:00',
'2013-04-01 01:00:00', '2013-04-01 02:00:00',
'2013-04-01 02:00:00', '2013-04-01 03:00:00',
'2013-04-01 03:00:00', '2013-04-01 04:00:00',
'2013-04-01 04:00:00', '2013-04-01 05:00:00',
...
'2013-04-29 19:00:00', '2013-04-29 20:00:00',
'2013-04-29 20:00:00', '2013-04-29 21:00:00',
'2013-04-29 21:00:00', '2013-04-29 22:00:00',
'2013-04-29 22:00:00', '2013-04-29 23:00:00',
'2013-04-29 23:00:00', '2013-04-30 00:00:00'],
dtype='datetime64[ns]', length=1392, freq=None)
Answer 0 (score: 1)
Here is one way:
In [2249]: d = pd.date_range(start='2013-04-01', end='2013-04-30', freq='H')
In [2250]: pd.DatetimeIndex([v for p in zip(d, d[1:]) for v in p])
Out[2250]:
DatetimeIndex(['2013-04-01 00:00:00', '2013-04-01 01:00:00',
'2013-04-01 01:00:00', '2013-04-01 02:00:00',
'2013-04-01 02:00:00', '2013-04-01 03:00:00',
'2013-04-01 03:00:00', '2013-04-01 04:00:00',
'2013-04-01 04:00:00', '2013-04-01 05:00:00',
...
'2013-04-29 19:00:00', '2013-04-29 20:00:00',
'2013-04-29 20:00:00', '2013-04-29 21:00:00',
'2013-04-29 21:00:00', '2013-04-29 22:00:00',
'2013-04-29 22:00:00', '2013-04-29 23:00:00',
'2013-04-29 23:00:00', '2013-04-30 00:00:00'],
dtype='datetime64[ns]', length=1392, freq=None)
Or,
In [2251]: import itertools
In [2252]: pd.DatetimeIndex(itertools.chain(*zip(d, d[1:])))
Out[2252]:
DatetimeIndex(['2013-04-01 00:00:00', '2013-04-01 01:00:00',
'2013-04-01 01:00:00', '2013-04-01 02:00:00',
'2013-04-01 02:00:00', '2013-04-01 03:00:00',
'2013-04-01 03:00:00', '2013-04-01 04:00:00',
'2013-04-01 04:00:00', '2013-04-01 05:00:00',
...
'2013-04-29 19:00:00', '2013-04-29 20:00:00',
'2013-04-29 20:00:00', '2013-04-29 21:00:00',
'2013-04-29 21:00:00', '2013-04-29 22:00:00',
'2013-04-29 22:00:00', '2013-04-29 23:00:00',
'2013-04-29 23:00:00', '2013-04-30 00:00:00'],
dtype='datetime64[ns]', length=1392, freq=None)
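For anyone running this outside IPython, the same pairing idea can be written as a plain script. This is only a sketch: the helper name `hourly_pair_endpoints` is made up for illustration, and I pass `pd.Timedelta(hours=1)` as the frequency instead of the `'H'` string purely to sidestep differences in alias spelling between pandas versions.

```python
import pandas as pd


def hourly_pair_endpoints(start, end):
    """Return the endpoints of consecutive hourly intervals, flattened.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Hourly boundaries: 00:00, 01:00, 02:00, ...
    d = pd.date_range(start=start, end=end, freq=pd.Timedelta(hours=1))
    # Pair each boundary with the next one, then flatten the pairs.
    return pd.DatetimeIndex([v for pair in zip(d, d[1:]) for v in pair])


idx = hourly_pair_endpoints("2013-04-01", "2013-04-30")
print(len(idx))
print(idx[:4])
```

Every interior boundary appears twice, once as the end of one interval and once as the start of the next, which is why the result carries no regular frequency.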
Answer 1 (score: 1)
A one-liner that does it directly:
In [237]: pd.date_range(start='2013-04-01', end='2013-04-30', freq='0.5H1U').round('1H')
Out[237]:
DatetimeIndex(['2013-04-01 00:00:00', '2013-04-01 01:00:00',
'2013-04-01 01:00:00', '2013-04-01 02:00:00',
'2013-04-01 02:00:00', '2013-04-01 03:00:00',
'2013-04-01 03:00:00', '2013-04-01 04:00:00',
'2013-04-01 04:00:00', '2013-04-01 05:00:00',
...
'2013-04-29 19:00:00', '2013-04-29 20:00:00',
'2013-04-29 20:00:00', '2013-04-29 21:00:00',
'2013-04-29 21:00:00', '2013-04-29 22:00:00',
'2013-04-29 22:00:00', '2013-04-29 23:00:00',
'2013-04-29 23:00:00', '2013-04-30 00:00:00'],
dtype='datetime64[ns]', length=1392, freq=None)
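I use a frequency of half an hour plus one microsecond (the `U` in the frequency string) so that the rounding always falls on the "right" side.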
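To see what the extra microsecond buys, here is a small sketch over a four-hour window that rounds a plain half-hour grid and a nudged one side by side. The variable names are mine, and I build the frequency from `pd.Timedelta` objects and round to `"h"` rather than using the `'0.5H1U'` and `'1H'` strings, on the assumption that this is equivalent; recent pandas versions prefer the lowercase spellings. Without the nudge, every `:30` timestamp sits exactly halfway between two hours, so the result depends on how ties are broken (as far as I know, pandas rounds ties to the even multiple, not always up).

```python
import pandas as pd

start, end = "2013-04-01 00:00", "2013-04-01 04:00"

half_hour = pd.Timedelta(minutes=30)
nudge = pd.Timedelta(microseconds=1)  # pushes every :30 tick just past the halfway point

# Plain half-hour grid: the :30 values are exact ties when rounding to the hour.
plain = pd.date_range(start, end, freq=half_hour).round("h")

# Nudged grid: each :30 value is slightly past the tie, so it rounds up.
nudged = pd.date_range(start, end, freq=half_hour + nudge).round("h")

print(list(plain.hour))
print(list(nudged.hour))
```

The nudge accumulates by one microsecond per step, so this trick relies on the range being short enough that the accumulated drift stays far below half an hour. That holds comfortably for a month of data.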