如何应用扩展窗口公式,该公式在Pandas数据框中的日期更改时重新启动?

时间:2015-12-10 19:15:49

标签: python pandas dataframe cumsum

我的数据看起来像这样

Date and Time           Close   dif  Cum_
2015/01/01 17:00:00.211 2030.25 0.3  0.3
2015/01/01 17:00:02.456 2030.75 0.5  0.8
2015/01/01 23:55:01.491 2037.25 2.4  3.2
2015/01/02 00:02:01.955 2036.75 0.4  0.4
2015/01/02 00:04:04.887 2036.5  0.3  0.7
2015/01/02 15:14:56.207 2021.5  4.7  5.0
2015/01/05 17:00:00.020 2021.5  4.7  4.7
2015/01/05 17:00:00.105 2020.5  0    4.7
2015/01/05 17:00:01.077 2021    0.4  5.1

我想做一个每天重置的dif列的cumsum,所以输出看起来像:

{{1}}

由于

1 个答案:

答案 0 :(得分:1)

使用类似的例子:

df = pd.DataFrame({'time': pd.DatetimeIndex(freq='H', start=date(2015,1,1), periods=100), 'value': np.random.random(100)}).set_index('time')
print(df.groupby(pd.TimeGrouper('D')).apply(lambda x: x.cumsum()))


                         value
time                          
2015-01-01 00:00:00   0.112809
2015-01-01 01:00:00   0.175091
2015-01-01 02:00:00   0.257127
2015-01-01 03:00:00   0.711317
2015-01-01 04:00:00   1.372902
2015-01-01 05:00:00   1.544617
2015-01-01 06:00:00   1.748132
2015-01-01 07:00:00   2.547540
2015-01-01 08:00:00   2.799640
2015-01-01 09:00:00   2.913003
2015-01-01 10:00:00   3.883643
2015-01-01 11:00:00   3.926428
2015-01-01 12:00:00   4.045293
2015-01-01 13:00:00   4.214375
2015-01-01 14:00:00   4.456385
2015-01-01 15:00:00   5.374335
2015-01-01 16:00:00   5.828024
2015-01-01 17:00:00   6.295117
2015-01-01 18:00:00   7.171010
2015-01-01 19:00:00   7.907834
2015-01-01 20:00:00   8.132203
2015-01-01 21:00:00   9.007994
2015-01-01 22:00:00   9.755925
2015-01-01 23:00:00  10.373546
2015-01-02 00:00:00   0.797521
2015-01-02 01:00:00   1.582709
2015-01-02 02:00:00   1.811771
2015-01-02 03:00:00   2.493248
2015-01-02 04:00:00   3.278923
2015-01-02 05:00:00   3.626356
...                        ...
2015-01-03 22:00:00  11.625891
2015-01-03 23:00:00  12.597532
2015-01-04 00:00:00   0.075442
2015-01-04 01:00:00   0.155059
2015-01-04 02:00:00   0.754960
2015-01-04 03:00:00   0.926798
2015-01-04 04:00:00   1.890215
2015-01-04 05:00:00   2.734722
2015-01-04 06:00:00   2.803935
2015-01-04 07:00:00   3.103064
2015-01-04 08:00:00   3.727508
2015-01-04 09:00:00   4.117465
2015-01-04 10:00:00   4.250926
2015-01-04 11:00:00   4.996832
2015-01-04 12:00:00   5.081889
2015-01-04 13:00:00   5.493243
2015-01-04 14:00:00   5.987519
2015-01-04 15:00:00   6.719041
2015-01-04 16:00:00   7.325912
2015-01-04 17:00:00   8.163208
2015-01-04 18:00:00   9.015092
2015-01-04 19:00:00   9.062396
2015-01-04 20:00:00   9.350298
2015-01-04 21:00:00   9.947669
2015-01-04 22:00:00  10.820609
2015-01-04 23:00:00  11.165523
2015-01-05 00:00:00   0.385323
2015-01-05 01:00:00   0.999182
2015-01-05 02:00:00   1.240272
2015-01-05 03:00:00   1.398086

因此,在您的示例中,请df.set_index('Date & Time')然后groupbyapply。您当然可以将结果分配回原始DataFrame