I have the following DataFrame:
> cal.loc['2007-11-20':'2007-11-30']
market_open market_close
2007-11-20 2007-11-20 14:30:00+00:00 2007-11-20 21:00:00+00:00
2007-11-21 2007-11-21 14:30:00+00:00 2007-11-21 21:00:00+00:00
2007-11-23 2007-11-23 14:30:00+00:00 2007-11-23 18:00:00+00:00
2007-11-26 2007-11-26 14:30:00+00:00 2007-11-26 21:00:00+00:00
2007-11-27 2007-11-27 14:30:00+00:00 2007-11-27 21:00:00+00:00
2007-11-28 2007-11-28 14:30:00+00:00 2007-11-28 21:00:00+00:00
2007-11-29 2007-11-29 14:30:00+00:00 2007-11-29 21:00:00+00:00
2007-11-30 2007-11-30 14:30:00+00:00 2007-11-30 21:00:00+00:00
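I want to build a custom DatetimeIndex from the DataFrame above, with a 1-minute frequency, covering only the specific ranges between each day's market_open and market_close.
For example: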
2007-11-21 14:30
2007-11-21 14:31
2007-11-21 14:32
2007-11-21 14:33
2007-11-21 14:34
...
2007-11-21 21:00
2007-11-23 14:30
2007-11-23 14:31
...
2007-11-23 18:00
2007-11-26 14:30
...
Answer 0: (score: 0)
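from functools import reduce
import pandas as pd

# df is the calendar frame from the question: index, market_open, market_close
# 'min' is the minute alias; the older 'T' alias is deprecated in recent pandas
reduce(
    pd.DatetimeIndex.union,
    (pd.date_range(o, c, freq='min') for i, o, c in df.itertuples())
)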
DatetimeIndex(['2007-11-20 14:30:00', '2007-11-20 14:31:00',
'2007-11-20 14:32:00', '2007-11-20 14:33:00',
'2007-11-20 14:34:00', '2007-11-20 14:35:00',
'2007-11-20 14:36:00', '2007-11-20 14:37:00',
'2007-11-20 14:38:00', '2007-11-20 14:39:00',
...
'2007-11-30 20:51:00', '2007-11-30 20:52:00',
'2007-11-30 20:53:00', '2007-11-30 20:54:00',
'2007-11-30 20:55:00', '2007-11-30 20:56:00',
'2007-11-30 20:57:00', '2007-11-30 20:58:00',
'2007-11-30 20:59:00', '2007-11-30 21:00:00'],
dtype='datetime64[ns]', length=2948, freq=None)
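The output above is timezone-naive, while the `cal` frame in the question holds UTC-aware timestamps. The following is a minimal sketch, not the answerer's exact code, that applies the same `reduce`/`union` idea to three sessions copied from the question while keeping them tz-aware. The names `cal` and `minutes` are chosen here for illustration, and `freq="min"` is used in place of `'T'` on the assumption that a recent pandas is installed.

```python
from functools import reduce

import pandas as pd

# Three sessions copied from the question, kept tz-aware (UTC).
cal = pd.DataFrame(
    {
        "market_open": pd.to_datetime(
            ["2007-11-20 14:30", "2007-11-21 14:30", "2007-11-23 14:30"], utc=True
        ),
        "market_close": pd.to_datetime(
            ["2007-11-20 21:00", "2007-11-21 21:00", "2007-11-23 18:00"], utc=True
        ),
    },
    index=pd.to_datetime(["2007-11-20", "2007-11-21", "2007-11-23"]),
)

# One minute-level range per session, merged pairwise with union.
# itertuples() yields (index, market_open, market_close) for this frame.
minutes = reduce(
    pd.DatetimeIndex.union,
    (pd.date_range(o, c, freq="min") for _, o, c in cal.itertuples()),
)

print(len(minutes))
print(minutes[0], minutes[-1])
```

Both endpoints of each session are included, so a 14:30 to 21:00 session contributes 391 entries and the shortened 14:30 to 18:00 session contributes 211. Timestamps outside the sessions, such as anything on 2007-11-22, do not appear in the result.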
Answer 1: (score: 0)
You can do this by looping over the rows, creating an index for each row, and then combining the indexes at the end.
# setup data
import pandas as pd

df = pd.DataFrame([["2007-11-20", "2007-11-20 14:30:00", "2007-11-20 21:00:00"],
                   ["2007-11-21", "2007-11-21 14:30:00", "2007-11-21 21:00:00"],
                   ["2007-11-23", "2007-11-23 14:30:00", "2007-11-23 18:00:00"],
                   ["2007-11-26", "2007-11-26 14:30:00", "2007-11-26 21:00:00"],
                   ["2007-11-27", "2007-11-27 14:30:00", "2007-11-27 21:00:00"],
                   ["2007-11-28", "2007-11-28 14:30:00", "2007-11-28 21:00:00"],
                   ["2007-11-29", "2007-11-29 14:30:00", "2007-11-29 21:00:00"],
                   ["2007-11-30", "2007-11-30 14:30:00", "2007-11-30 21:00:00"]],
                  columns=["day", "market_open", "market_close"])
# convert the string columns to datetimes
df['day'] = pd.to_datetime(df['day'])
df['market_open'] = pd.to_datetime(df['market_open'])
df['market_close'] = pd.to_datetime(df['market_close'])
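# code for calculating the index
idxs = []
for i, r in df.iterrows():
    # pd.date_range is used here because current pandas no longer accepts
    # the DatetimeIndex(start=..., end=..., freq=...) constructor form
    idx = pd.date_range(start=r['market_open'], end=r['market_close'], freq="min")
    idxs.append(idx)
# append the remaining per-day indexes to the first one
new_index = idxs[0].append(idxs[1:])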
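A quick way to sanity-check the loop-and-append approach is to compare the length of the combined index with the number of minutes each session should contribute (close minus open, plus one because both endpoints are included). The sketch below is an illustration under stated assumptions: `session_minutes` is a hypothetical helper name, the two sessions are copied from the question, and `pd.date_range` with `freq="min"` is assumed as the way to build each per-day range.

```python
import pandas as pd


def session_minutes(sessions: pd.DataFrame) -> pd.DatetimeIndex:
    """Hypothetical helper: one minute-level range per row, appended in row order."""
    idxs = [
        pd.date_range(start=o, end=c, freq="min")
        for o, c in zip(sessions["market_open"], sessions["market_close"])
    ]
    # Index.append accepts a list of indexes and concatenates them in order.
    return idxs[0].append(idxs[1:])


# Two sessions copied from the question: a full day and a shortened day.
sessions = pd.DataFrame(
    {
        "market_open": pd.to_datetime(["2007-11-21 14:30:00", "2007-11-23 14:30:00"]),
        "market_close": pd.to_datetime(["2007-11-21 21:00:00", "2007-11-23 18:00:00"]),
    }
)

new_index = session_minutes(sessions)

# Expected size: minutes between open and close, plus one for the inclusive end.
span = sessions["market_close"] - sessions["market_open"]
expected = int((span.dt.total_seconds() // 60 + 1).sum())

print(len(new_index), expected)
```

Unlike `union`, `append` neither sorts nor removes duplicates, so this relies on the sessions being listed in chronological order and not overlapping.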