How to create durations for different items in a single DataFrame
Sample dataset
Item Date OnSale
apple 2017-01-01 Yes
orange 2017-01-01 Yes
orange 2017-01-02 Yes
orange 2017-01-03 No
apple 2017-01-02 No
apple 2017-01-03 No
apple 2017-01-04 No
apple 2017-01-05 Yes
How can I calculate the number of days since each item was last on sale?
Desired output
Item Date OnSale DaySinceSale
apple 2017-01-01 Yes 0
orange 2017-01-01 Yes 0
orange 2017-01-02 Yes 0
orange 2017-01-03 No 1
apple 2017-01-02 No 1
apple 2017-01-03 No 2
apple 2017-01-04 No 3
apple 2017-01-05 Yes 0
Answer 0 (score: 2)
Try
df['DaySinceSale'] = df.groupby('Item')['OnSale'].apply(lambda x: (x == 'No') * (x == 'No').cumsum())
Item Date OnSale DaySinceSale
0 apple 2017-01-01 Yes 0
1 orange 2017-01-01 Yes 0
2 orange 2017-01-02 Yes 0
3 orange 2017-01-03 No 1
4 apple 2017-01-02 No 1
5 apple 2017-01-03 No 2
6 apple 2017-01-04 No 3
7 apple 2017-01-05 Yes 0
You can also use Series.multiply()
df.groupby('Item')['OnSale'].apply(lambda x: (x == 'No').multiply((x == 'No').cumsum()))
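One caveat about the expressions above: the cumulative sum runs over every 'No' row in the item's group, so the counter does not restart if an item goes off sale a second time. The sample data never hits that case. Below is a minimal sketch of a variant that restarts after each 'Yes'. It assumes one row per item per day, already in date order. The helper name days_since_sale and the extra apple row for 2017-01-06 are hypothetical, added only to show the reset. It uses transform rather than apply because transform returns a result aligned to the original index, whereas recent pandas versions may prepend the group keys to the index returned by apply.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical extra row
# (apple, 2017-01-06, No) to show the counter restarting after a sale.
df = pd.DataFrame({
    "Item": ["apple", "orange", "orange", "orange",
             "apple", "apple", "apple", "apple", "apple"],
    "Date": pd.to_datetime([
        "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-03",
        "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05",
        "2017-01-06",
    ]),
    "OnSale": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes", "No"],
})


def days_since_sale(on_sale: pd.Series) -> pd.Series:
    """Count consecutive 'No' rows since the most recent 'Yes' (hypothetical helper)."""
    off = on_sale.eq("No")
    # Every 'Yes' row starts a new run; rows after it share its run id.
    run_id = (~off).cumsum()
    # Within each run, count the 'No' rows seen so far; 'Yes' rows stay at 0.
    return off.astype(int).groupby(run_id).cumsum()


# transform keeps the original row index, so the assignment lines up row by row.
df["DaySinceSale"] = df.groupby("Item")["OnSale"].transform(days_since_sale)
print(df)
```

For the original eight rows this gives the same values as the answer above. Note that it counts rows, not calendar days, so gaps in the dates would need a different approach, and an item whose first rows are 'No' is counted from the start of its data.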