I'm trying to add an incremental counter to my DataFrame, but it needs to follow certain conditions.
Original DF:
import pandas as pd

df = pd.DataFrame(
[[123,'1/1','Yes'],
[123,'1/1','Yes'],
[123,'1/1','No'],
[123,'1/1','No'],
[123,'1/1','No'],
[123,'1/1','Yes'],
[123,'1/2','Yes'],
[123,'1/2','No'],
[123,'1/2','No'],
[123,'1/2','Yes'],
[123,'1/2','No'],
[123,'1/2','Yes']
], columns=['AcctId','Date','Yes_No'])
df
AcctId Date Yes_No
123 1/1 Yes
123 1/1 Yes
123 1/1 No
123 1/1 No
123 1/1 No
123 1/1 Yes
123 1/2 Yes
123 1/2 No
123 1/2 No
123 1/2 Yes
123 1/2 No
123 1/2 Yes
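Conditions:
If Yes_No == 'No', then Counter = 0.
If Yes_No == 'Yes' and it is the first 'Yes' for that Date, then Counter = 1.
If Yes_No == 'Yes' and the previous Yes_No == 'Yes', then Counter is not increased.
If Yes_No == 'Yes' and the previous Yes_No == 'No', then Counter is increased by 1.
If Date changes, Counter is reset.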
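To make these rules concrete, here is a plain row-by-row sketch of what I mean. The helper name add_counter is hypothetical (it is not something from pandas), and I'm assuming that rows for the same Date sit next to each other and that row 8 of the data belongs to 1/2, as in the printed table. It is only meant to pin down the logic, not to be the way this should be done in pandas.

```python
import pandas as pd

def add_counter(frame):
    """Hypothetical helper: walk the rows in order and apply the rules above."""
    counters = []
    prev_date = None   # Date of the previous row
    prev_yes = False   # whether the previous row was 'Yes'
    yes_runs = 0       # number of separate 'Yes' runs seen so far on this Date
    for date, flag in zip(frame['Date'], frame['Yes_No']):
        if date != prev_date:
            # Date changed: reset the counter state
            yes_runs = 0
            prev_yes = False
        if flag == 'Yes':
            if not prev_yes:
                # 'Yes' after a 'No' (or first row of the Date): new run
                yes_runs += 1
            counters.append(yes_runs)
        else:
            # 'No' rows always get 0
            counters.append(0)
        prev_yes = (flag == 'Yes')
        prev_date = date
    return frame.assign(Counter=counters)

sample = pd.DataFrame(
    [[123, '1/1', 'Yes'], [123, '1/1', 'Yes'], [123, '1/1', 'No'],
     [123, '1/1', 'No'], [123, '1/1', 'No'], [123, '1/1', 'Yes'],
     [123, '1/2', 'Yes'], [123, '1/2', 'No'], [123, '1/2', 'No'],
     [123, '1/2', 'Yes'], [123, '1/2', 'No'], [123, '1/2', 'Yes']],
    columns=['AcctId', 'Date', 'Yes_No'])

result = add_counter(sample)
print(result)
```

This loop ignores AcctId because the sample only has one account; with several accounts the reset check would presumably need to compare AcctId as well.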
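The closest I've gotten is the following, but it doesn't account for Date or for Yes_No == 'No' (it increments on every change in Yes_No, in either direction, and never resets):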
df['Counter'] = (df['Yes_No'] != df['Yes_No'].shift(1)).astype(int).cumsum()
Preferred output:
df
AcctId Date Yes_No Counter
123 1/1 Yes 1 #Set Counter = 1 since this is first instance where 'Yes_No' == 'Yes' on 1/1
123 1/1 Yes 1 #Keep Counter = 1 since previous row 'Yes_No' == 'Yes'
123 1/1 No 0 #Set Counter = 0 since 'Yes_No' == 'No'
123 1/1 No 0 #Keep Counter = 0 since 'Yes_No' == 'No'
123 1/1 No 0 #Keep Counter = 0 since 'Yes_No' == 'No'
123 1/1 Yes 2 #Increase counter to 2 since this is 2nd non-consecutive instance of 'Yes' on 1/1
123 1/2 Yes 1 #Reset Counter to 1 since 'Date' changed
123 1/2 No 0
123 1/2 No 0
123 1/2 Yes 2 #Increase Counter to 2 since this is 2nd non-consecutive instance of 'Yes' on 1/2
123 1/2 No 0
123 1/2 Yes 3 #Increase Counter to 3 since this is 3rd non-consecutive instance of 'Yes' on 1/2
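One direction that seems to reproduce this output is a grouped shift plus a grouped cumsum. This is only a sketch under assumptions: I'm grouping by both AcctId and Date (my rules only mention Date), I'm treating row 8 as 1/2 as in the printed table, and the intermediate names keys, is_yes, prev_yes, starts_run and run_number are made up for illustration. I'm not sure it is the idiomatic way to do it.

```python
import pandas as pd

df = pd.DataFrame(
    [[123, '1/1', 'Yes'], [123, '1/1', 'Yes'], [123, '1/1', 'No'],
     [123, '1/1', 'No'], [123, '1/1', 'No'], [123, '1/1', 'Yes'],
     [123, '1/2', 'Yes'], [123, '1/2', 'No'], [123, '1/2', 'No'],
     [123, '1/2', 'Yes'], [123, '1/2', 'No'], [123, '1/2', 'Yes']],
    columns=['AcctId', 'Date', 'Yes_No'])

# Assumption: the counter restarts per (AcctId, Date) pair
keys = [df['AcctId'], df['Date']]

# 1 where the row is 'Yes', 0 where it is 'No'
is_yes = df['Yes_No'].eq('Yes').astype(int)

# Previous row's flag within the same group; the first row of each
# group has no predecessor, so it is filled with 0 (treated as 'No')
prev_yes = is_yes.groupby(keys).shift(fill_value=0)

# A new run of 'Yes' starts where this row is 'Yes' and the previous was not
starts_run = (is_yes.eq(1) & prev_yes.eq(0)).astype(int)

# Running count of 'Yes' runs within each group
run_number = starts_run.groupby(keys).cumsum()

# Keep the run number on 'Yes' rows, force 0 on 'No' rows
df['Counter'] = run_number.where(is_yes.eq(1), 0)
print(df)
```

Unlike my attempt above, this only counts transitions into 'Yes', and the grouping handles the reset when Date changes.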
Thanks in advance!