Sample data
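import numpy as np
import pandas as pd

date_rng = pd.date_range('2019-01-01', freq='s', periods=400)
# one column of log-normally distributed random values, indexed by timestamp
df = pd.DataFrame(np.random.lognormal(.005, .5, size=len(date_rng)),
                  columns=['data1'],
                  index=date_rng)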
Sample input
data1
2019-01-01 00:00:00 1
2019-01-01 00:00:01 -2
2019-01-01 00:00:02 2
2019-01-01 00:00:03 3
2019-01-01 00:00:04 1
2019-01-01 00:00:05 2
2019-01-01 00:00:06 -1
2019-01-01 00:00:07 3
2019-01-01 00:00:08 4
2019-01-01 00:00:09 5
2019-01-01 00:00:10 7
2019-01-01 00:00:11 2
2019-01-01 00:00:12 4
2019-01-01 00:00:13 -1
2019-01-01 00:00:14 5
2019-01-01 00:00:15 3
2019-01-01 00:00:16 5
2019-01-01 00:00:17 -3
... ...
Expected output
data1 cumsum
2019-01-01 00:00:00 1 1
2019-01-01 00:00:01 -2 -1
2019-01-01 00:00:02 2 1
2019-01-01 00:00:03 3 4
2019-01-01 00:00:04 1 5 (reset cumsum at this point)
2019-01-01 00:00:05 2 2
2019-01-01 00:00:06 -1 1
2019-01-01 00:00:07 3 4
2019-01-01 00:00:08 4 8 (reset at this point)
2019-01-01 00:00:09 5 5 (reset at this point)
2019-01-01 00:00:10 7 7 (reset at this point)
2019-01-01 00:00:11 2 2
2019-01-01 00:00:12 4 6 (reset at this point)
2019-01-01 00:00:13 -1 -1
2019-01-01 00:00:14 5 4
2019-01-01 00:00:15 3 7 (reset at this point)
2019-01-01 00:00:16 1 1
2019-01-01 00:00:17 -3 -2
... ...
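I want to compute the cumulative sum of data1 with a reset condition: the running sum resets whenever it is greater than 5 and also exceeds 20% of its value at the last reset. For the very first reset there is no previous reset to compare against, so it resets only on the greater-than-5 condition. After that, resets are based on both conditions.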
I have checked other answers on Stack Overflow but did not find a similar question. Please suggest how I could solve this.
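To make the rule concrete, here is a minimal plain-Python sketch of the behaviour I am after. It rests on assumptions read off the expected output rather than a confirmed spec: the threshold comparison is treated as inclusive (the first reset in the table happens at exactly 5), and "exceeds 20% of the last reset" is read as `running > 0.2 * last_reset`. The function name `cumsum_with_reset` and its parameters `threshold` and `ratio` are hypothetical names chosen for illustration. The `data1` values are the ones listed in the expected-output table.

```python
def cumsum_with_reset(values, threshold=5, ratio=0.2):
    """Running sum that restarts after each row where a reset fires.

    Assumed rule (inferred from the expected output, not confirmed):
    a reset fires when the running sum is >= threshold and, once a
    previous reset exists, is also > ratio * (running sum at that reset).
    """
    sums = []
    resets = []
    running = 0
    last_reset = None  # running sum at the most recent reset, if any
    for v in values:
        running += v
        fires = bool(
            running >= threshold
            and (last_reset is None or running > ratio * last_reset)
        )
        sums.append(running)
        resets.append(fires)
        if fires:
            # remember the value we reset at, then start over
            last_reset = running
            running = 0
    return sums, resets


# data1 values as listed in the expected-output table
data1 = [1, -2, 2, 3, 1, 2, -1, 3, 4, 5, 7, 2, 4, -1, 5, 3, 1, -3]
sums, resets = cumsum_with_reset(data1)
for value, total, fired in zip(data1, sums, resets):
    print(value, total, "(reset at this point)" if fired else "")
```

With a DataFrame like the one above, the two returned lists could be attached as columns, for example `df['cumsum'], df['reset'] = cumsum_with_reset(df['data1'])`. This is an explicit loop over the rows, so it is a readable reference rather than a vectorized solution.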