I have a DataFrame that looks like this:
   datetime             price    tickvol  bid      ask
0  2016-10-11 12:24:03  2130.25  1        2130.00  2130.25
1  2016-10-11 13:31:03  2130.25  1        2130.00  2130.25
...
I also have a CustomBusinessHour that looks like this:
cbh = CustomBusinessHour(start='13:30', end='13:15', weekmask='Sun Mon Tue Wed Thu')
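I was hoping to create a new index level from the start timestamp of the custom business hour session that each row falls in, but I can't get it to work.
What I would like to end up with is: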
cbh                  datetime             price    tickvol  bid      ask
2016-10-10 13:30:00  2016-10-11 12:24:03  2130.25  1        2130.00  2130.25
2016-10-11 13:30:00  2016-10-11 13:31:03  2130.25  1        2130.00  2130.25
Answer 0: (score: 0)
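This is what I ended up doing. It works, but it could probably be improved. CustomBusinessHour
doesn't seem to directly expose any way to determine whether a given time falls inside a session.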
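import pandas as pd
from pandas.tseries.offsets import CustomBusinessHour


def _single(value):
    # Newer pandas versions may return tuples from cbh.start / cbh.end, older
    # ones a single datetime.time. Only one interval is handled here.
    return value[0] if isinstance(value, tuple) else value


def spans_midnight(cbh):
    # Helper not shown in the original answer; assumed here to mean that the
    # session closes on the calendar day after it opens.
    return _single(cbh.end) < _single(cbh.start)


def session_start(ts, cbh):
    """Given a timestamp and a CustomBusinessHour, return the session start
    timestamp, or pd.NaT if the timestamp falls outside the session."""
    assert isinstance(ts, pd.Timestamp)
    start, end = _single(cbh.start), _single(cbh.end)
    t = ts.time()
    if spans_midnight(cbh):
        if end <= t < start:
            return pd.NaT
        elif t < start:
            # this timestamp is part of the previous calendar day session;
            # subtracting a Timedelta also works on the first of the month
            ts = ts - pd.Timedelta(days=1)
    else:
        if end <= t or t < start:
            return pd.NaT
    return ts.replace(hour=start.hour, minute=start.minute,
                      second=start.second, microsecond=start.microsecond)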
# assuming df looks similar to the one in the problem statement, but is
# indexed by the datetime values...
cbh = CustomBusinessHour(start='06:30', end='13:15',
                         weekmask='Mon Tue Wed Thu Fri')
df['session_start'] = df.index.map(lambda x: session_start(x, cbh))
df.dropna(how='all', subset=['session_start'], inplace=True)
df.set_index(['session_start', df.index], drop=True, inplace=True)
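The session-start rule itself can be checked without pandas. The following is only a minimal sketch, not the answer's code: `find_session_start` and its parameters are hypothetical names, it assumes a single session interval per day, and it ignores the weekmask and holidays entirely. It uses the 13:30 to 13:15 session from the question and the two rows shown there.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def find_session_start(ts: datetime, start: time, end: time) -> Optional[datetime]:
    """Return the start of the session containing ts, or None if ts is outside it.

    Hypothetical helper: one session per day, no weekmask or holiday handling.
    """
    t = ts.time()
    if end < start:
        # Session spans midnight: the only gap is [end, start) within one day.
        if end <= t < start:
            return None
        # Times before the opening time belong to yesterday's session.
        day = ts.date() if t >= start else ts.date() - timedelta(days=1)
    else:
        # Ordinary session: ts must fall in [start, end).
        if not (start <= t < end):
            return None
        day = ts.date()
    return datetime.combine(day, start)


# Rows from the question, with the question's 13:30 -> 13:15 session.
opening, closing = time(13, 30), time(13, 15)
first = find_session_start(datetime(2016, 10, 11, 12, 24, 3), opening, closing)
second = find_session_start(datetime(2016, 10, 11, 13, 31, 3), opening, closing)
print(first, second)
```

Stepping back with `timedelta(days=1)` rather than decrementing the day number keeps the result valid on the first day of a month.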