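I want to compute two things: a running count of master events, and a cumulative sum of Event B occurrences between master events. In other words, whenever a master event occurs, the cumulative sum of Event B should reset to zero.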
The desired output is shown below.
Sample input code, from @Jon Strutz:
import pandas as pd
df = pd.DataFrame({'year': [2019] * 10,
                   'month': [8] * 10,
                   'day': [16] * 10,
                   'hour': [12, 12, 12, 12, 13, 13, 13, 13, 13, 13],
                   'minute': [50, 52, 53, 57, 0, 3, 4, 5, 13, 21]})
df = pd.DataFrame(pd.to_datetime(df), columns=['Time_Stamp'])
df['Event_Master'] = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
df['Event_B'] = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]
The expected output would look like:
df['Event_Master_Out'] = [0, 0, 1, 1, 1 ,1, 1, 1, 2,2]
df['Event_B_Out'] = [0, 0, 0, 1, 1 ,1, 2, 2, 0,1]
Answer 0 (score: 2)
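Use Series.cumsum on Event_Master to number the stretches between master events, then group by that result and apply GroupBy.cumsum to Event_B: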
df['Event_Master_Out'] = df['Event_Master'].cumsum()
df['Event_B_Out'] = df.groupby('Event_Master_Out')['Event_B'].cumsum()
print(df)
           Time_Stamp  Event_Master  Event_B  Event_Master_Out  Event_B_Out
0 2019-08-16 12:50:00             0        0                 0            0
1 2019-08-16 12:52:00             0        0                 0            0
2 2019-08-16 12:53:00             1        0                 1            0
3 2019-08-16 12:57:00             0        1                 1            1
4 2019-08-16 13:00:00             0        0                 1            1
5 2019-08-16 13:03:00             0        0                 1            1
6 2019-08-16 13:04:00             0        1                 1            2
7 2019-08-16 13:05:00             0        0                 1            2
8 2019-08-16 13:13:00             1        0                 2            0
9 2019-08-16 13:21:00             0        1                 2            1
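If the helper column is not needed in the result, the same idea can be wrapped in a small function. This is only a sketch: the function name cumsum_between_events and its parameters are made up for illustration and are not part of pandas. It relies on the fact that groupby also accepts a Series of group keys, so the cumulative sum of Event_Master does not have to be stored in the frame first.

```python
import pandas as pd


def cumsum_between_events(df, master_col, event_col):
    """Running sum of event_col that restarts at every master_col event.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Each master event bumps the group id by one, so all rows between
    # two master events share the same id.
    group_id = df[master_col].cumsum()
    # Cumulative sum within each group; it restarts when the id changes.
    return df.groupby(group_id)[event_col].cumsum()


df = pd.DataFrame({
    'Event_Master': [0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    'Event_B':      [0, 0, 0, 1, 0, 0, 1, 0, 0, 1],
})
df['Event_B_Out'] = cumsum_between_events(df, 'Event_Master', 'Event_B')
print(df['Event_B_Out'].tolist())
# [0, 0, 0, 1, 1, 1, 2, 2, 0, 1]
```

One thing to keep in mind: the row on which a master event occurs already belongs to the new group, so an Event_B on that same row is counted toward the new stretch rather than the previous one.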