Creating a pandas MultiIndex based on CustomBusinessHour start

Time: 2016-10-19 05:47:16

Tags: python pandas

I have a DataFrame that looks like this:

             datetime    price  tickvol      bid      ask
0 2016-10-11 12:24:03  2130.25        1  2130.00  2130.25
1 2016-10-11 13:31:03  2130.25        1  2130.00  2130.25
...

I have a CustomBusinessHour that looks like this:

cbh = CustomBusinessHour(start='13:30', end='13:15', weekmask='Sun Mon Tue Wed Thu')

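I was hoping to use the start timestamp of each custom business hour session to create a new index level, but I can't get it to work.

What I'd like to end up with is: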

                cbh            datetime    price  tickvol      bid      ask
2016-10-10 13:30:00 2016-10-11 12:24:03  2130.25        1  2130.00  2130.25
2016-10-11 13:30:00 2016-10-11 13:31:03  2130.25        1  2130.00  2130.25

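1 answer:

Answer 0: (score: 0)

This is what I ended up doing. It works, but it could probably be improved. CustomBusinessHour doesn't seem to directly expose any way to determine whether a time falls inside a session.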

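import pandas as pd
from pandas.tseries.offsets import CustomBusinessHour


def _as_time(value):
    # Newer pandas versions return start/end as tuples of datetime.time.
    return value[0] if isinstance(value, tuple) else value


def spans_midnight(cbh):
    # Not shown in the original answer; assumed to mean "end is before start".
    return _as_time(cbh.end) < _as_time(cbh.start)


def session_start(ts, cbh):
    """Given a timestamp and a CustomBusinessHour, return the session start
    timestamp, or pd.NaT if ts falls outside the session hours."""
    assert isinstance(ts, pd.Timestamp)
    start, end = _as_time(cbh.start), _as_time(cbh.end)
    t = ts.time()
    if spans_midnight(cbh):
        if end <= t < start:
            return pd.NaT
        elif t < start:
            # this timestamp is part of the previous calendar day session;
            # subtract a day (replace(day=ts.day - 1) fails on the 1st)
            ts = ts - pd.Timedelta(days=1)
    else:
        if end <= t or t < start:
            return pd.NaT

    return ts.replace(hour=start.hour, minute=start.minute,
                      second=start.second,
                      microsecond=start.microsecond)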


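# assuming df looks similar to the one in the problem statement,
# move the 'datetime' column into the index so that each index value
# is a pd.Timestamp
df = df.set_index('datetime')
cbh = CustomBusinessHour(start='06:30', end='13:15',
                         weekmask='Mon Tue Wed Thu Fri')
df['session_start'] = df.index.map(lambda x: session_start(x, cbh))
# drop rows that fall outside any session (session_start is NaT)
df.dropna(how='all', subset=['session_start'], inplace=True)
df.set_index(['session_start', df.index], drop=True, inplace=True)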
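For larger frames, the same idea can be sketched without a per-row Python function. This is not the method from the answer above, just a vectorized alternative under stated assumptions: the session hours are written out by hand as `start_offset` and `session_length` (both hypothetical names, mirroring start='13:30' and end='13:15' from the question) rather than read from the CustomBusinessHour, and the weekmask is ignored. The third row is a made-up tick added only to show a timestamp that falls in the gap between sessions.

```python
import pandas as pd

# The two sample rows from the question, plus one hypothetical row (13:20)
# that falls in the 13:15-13:30 gap between sessions.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2016-10-11 12:24:03",
        "2016-10-11 13:31:03",
        "2016-10-12 13:20:00",  # hypothetical, not in the original data
    ]),
    "price": [2130.25, 2130.25, 2130.25],
    "tickvol": [1, 1, 1],
    "bid": [2130.00, 2130.00, 2130.00],
    "ask": [2130.25, 2130.25, 2130.25],
})

# Assumed session parameters: opens at 13:30, closes at 13:15 the next day.
start_offset = pd.Timedelta(hours=13, minutes=30)
session_length = pd.Timedelta(hours=23, minutes=45)

# Shift every timestamp so that sessions begin at midnight, truncate to the
# day, then shift back to get the candidate session start.
shifted = df["datetime"] - start_offset
session = shifted.dt.normalize() + start_offset

# A timestamp belongs to the session only if it is less than one session
# length past the start; otherwise it sits in the gap and becomes NaT.
in_session = (df["datetime"] - session) < session_length
df["cbh"] = session.where(in_session)

# Drop out-of-session rows and build the two-level index.
result = df.dropna(subset=["cbh"]).set_index(["cbh", "datetime"])
print(result)
```

With the question's data, the 12:24:03 row is assigned to the session that opened on 2016-10-10 at 13:30 and the 13:31:03 row to the one that opened on 2016-10-11 at 13:30, which matches the desired output. Honoring the weekmask would need an extra filter on the day of week of the computed session start.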