Question

I have a pandas DataFrame that looks like this:

[image: screenshot of the DataFrame]

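The dataset spans several years of minute-level data.

For each day, I want to apply a function that sums all logvol values between 14:40:00 and 15:00:00.

I think this has something to do with resample, but I'm not sure how to use it.

I was thinking of something like: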

def fn(x):
    # not sure how to pass a time slice into the function
    pass

data['logvol'].resample('D').apply(fn)

or:

data['logvol'].resample('D').apply(lambda x: np.cumsum(x.between_time('14:40:00', '15:00:00')))

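Basically, I'm not sure which object gets passed to fn(). Is it a single row (one minute, in this case), or the collection of all minutes within the resampled day 'D'?

Any pointers in the right direction would be much appreciated.

Thanks!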

Answer 1

I figured it out. I used:

# originally written as resample('D', how='sum'); the how= argument has since been removed from pandas
data['logvol'].between_time('14:40:00', '15:00:00').resample('D').sum()
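
For anyone who wants to try this out, here is a minimal sketch on made-up data. The two-day index and the constant logvol values are assumptions for illustration only, not the data from the question. It also shows that a function passed to resample('D').apply receives one Series per day holding all of that day's minutes, so the time slice can be taken inside the function as well.

```python
import numpy as np
import pandas as pd

# Synthetic minute-level data covering two full days (illustrative only).
idx = pd.date_range("2020-01-01 00:00", "2020-01-02 23:59", freq="min")
data = pd.DataFrame({"logvol": np.ones(len(idx))}, index=idx)

# Approach from the answer: filter by time of day first, then sum per day.
# between_time includes both endpoints by default, so each day contributes
# the 21 minutes from 14:40 through 15:00.
window = data["logvol"].between_time("14:40:00", "15:00:00")
daily_sum = window.resample("D").sum()

# Alternative: resample first. The function is called once per day with a
# Series of that day's rows, so it can slice by time itself.
def fn(day):
    return day.between_time("14:40:00", "15:00:00").sum()

per_day = data["logvol"].resample("D").apply(fn)

print(daily_sum)
print(per_day)
```

Both approaches give the same per-day totals here. Note that resample('D') emits a row for every calendar day in the range, so days with no rows in the window (weekends, for example) appear with a sum of 0.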