daychange SS
0.017065 0
-0.009259 100
0.031542 0
-0.004530 0
0.000709 0
0.004970 100
-0.021900 0
0.003611 0
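I have two columns. Whenever SS == 100, I want to compute the sum of the next 5 "daychange" values.
I am currently using the following, but it does not work the way I want: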
df['total'] = df.loc[df['SS'] == 100,['daychange']].sum(axis=1)
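Answer 0 (score: 3)
Since pandas 1.1 you can create a forward rolling window and then select the rows you want to keep in the dataframe. My notebook kernel was killed when I tried different arguments, so use this with care.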
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=5)
df['total'] = df.daychange.rolling(indexer, min_periods=1).sum()[df.SS == 100]
df
Out:
daychange SS total
0 0.017065 0 NaN
1 -0.009259 100 0.023432
2 0.031542 0 NaN
3 -0.004530 0 NaN
4 0.000709 0 NaN
5 0.004970 100 -0.013319
6 -0.021900 0 NaN
7 0.003611 0 NaN
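For anyone who wants to reproduce the result above, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and uses `Series.where` instead of boolean indexing to blank out the rows where SS is not 100; that substitution is my own choice and should give the same column.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "daychange": [0.017065, -0.009259, 0.031542, -0.004530,
                  0.000709, 0.004970, -0.021900, 0.003611],
    "SS": [0, 100, 0, 0, 0, 100, 0, 0],
})

# Forward-looking window: row i covers rows i .. i+4 (requires pandas >= 1.1).
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=5)
forward_sum = df["daychange"].rolling(indexer, min_periods=1).sum()

# Keep the sum only on rows where SS == 100; all other rows become NaN.
df["total"] = forward_sum.where(df["SS"] == 100)
print(df)
```

Note that with `min_periods=1` the window near the end of the frame is simply shorter, so row 5 sums only the three rows that remain.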
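To exclude the starting row with SS == 100, the window has to start at the row after the one with SS == 100. Because the rolling sum is computed for every row, you can shift it back by one row before selecting: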
df['total'] = df.daychange.rolling(indexer, min_periods=1).sum().shift(-1)[df.SS == 100]
df
Out:
daychange SS total
0 0.017065 0 NaN
1 -0.009259 100 0.010791
2 0.031542 0 NaN
3 -0.004530 0 NaN
4 0.000709 0 NaN
5 0.004970 100 -0.018289
6 -0.021900 0 NaN
7 0.003611 0 NaN
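If you would rather get NaN than a partial sum when fewer than five rows follow a marker, one option is to raise `min_periods` to the window size. This is a sketch under that assumption about what you want; the column name `total_strict` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "daychange": [0.017065, -0.009259, 0.031542, -0.004530,
                  0.000709, 0.004970, -0.021900, 0.003611],
    "SS": [0, 100, 0, 0, 0, 100, 0, 0],
})

window = 5
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=window)

# min_periods=window: a window with fewer than 5 observations yields NaN.
strict_sum = df["daychange"].rolling(indexer, min_periods=window).sum()

# shift(-1) moves the sum that starts at row i+1 onto row i,
# so the marker row itself is excluded from its own total.
df["total_strict"] = strict_sum.shift(-1).where(df["SS"] == 100)
print(df)
```

With the sample data, row 1 still has five following rows and keeps its sum, while row 5 has only two following rows and ends up as NaN.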
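This feels like a hack, but it works and avoids computing unnecessary rolling values. It relies on the default integer index, because the index labels are passed to iloc as positions: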
df['next5sum'] = df[df.SS == 100].index.to_series().apply(lambda x: df.daychange.iloc[x: x + 5].sum())
df
Out:
daychange SS next5sum
0 0.017065 0 NaN
1 -0.009259 100 0.023432
2 0.031542 0 NaN
3 -0.004530 0 NaN
4 0.000709 0 NaN
5 0.004970 100 -0.013319
6 -0.021900 0 NaN
7 0.003611 0 NaN
For the sum of the next five rows excluding the row with SS == 100, you can either adjust the slice or shift the series:
df['next5sum'] = df[df.SS == 100].index.to_series().apply(lambda x: df.daychange.iloc[x + 1: x + 6].sum())
# df['next5sum'] = df[df.SS == 100].index.to_series().apply(lambda x: df.daychange.shift(-1).iloc[x: x + 5].sum())
df
Out:
daychange SS next5sum
0 0.017065 0 NaN
1 -0.009259 100 0.010791
2 0.031542 0 NaN
3 -0.004530 0 NaN
4 0.000709 0 NaN
5 0.004970 100 -0.018289
6 -0.021900 0 NaN
7 0.003611 0 NaN
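The slicing approach above feeds index labels to `iloc`, which only lines up when the frame has the default 0..n-1 index. A possible variant that works on positions instead is sketched below; the date index is a hypothetical stand-in for any non-default index, and `np.flatnonzero` is my substitution for the `index.to_series().apply(...)` call.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with a hypothetical date index
# to show that the code does not depend on a 0..n-1 index.
df = pd.DataFrame(
    {
        "daychange": [0.017065, -0.009259, 0.031542, -0.004530,
                      0.000709, 0.004970, -0.021900, 0.003611],
        "SS": [0, 100, 0, 0, 0, 100, 0, 0],
    },
    index=pd.date_range("2020-01-01", periods=8),
)

values = df["daychange"].to_numpy()

# Integer positions (not labels) of the rows where SS == 100.
positions = np.flatnonzero(df["SS"].to_numpy() == 100)

# Sum the five rows after each marker; slices past the end are just shorter.
sums = [values[p + 1 : p + 6].sum() for p in positions]

# Assignment aligns on the index, so unmarked rows are filled with NaN.
df["next5sum"] = pd.Series(sums, index=df.index[positions])
print(df)
```

Only the marked rows are summed, so no rolling values are computed for rows that would be thrown away.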