I have a DataFrame:
Date_1 Date_2 is_B weight_1
01/09/2019 02/08/2019 1 254
01/09/2019 02/08/2019 1 320
01/09/2019 04/08/2019 1 244
01/09/2019 04/08/2019 1 247
01/09/2019 14/08/2019 0 343
01/09/2019 14/08/2019 1 161
01/09/2019 14/08/2019 1 386
01/09/2019 15/08/2019 1 465
01/09/2019 15/08/2019 1 133
01/09/2019 15/08/2019 1 310
01/09/2019 15/08/2019 1 155
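I want to generate a column new_weight such that, for each Date_1, new_weight is 5000 minus the running total of weight_1 over the rows where is_B is 1. If is_B = 0, the previous new_weight value is copied into new_weight.
I know that to compute new_weight without the condition we can do:
df['new_weight'] = 5000 - df.groupby('Date_1')['weight_1'].cumsum()
But I am not sure how to apply the is_B condition in the code.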
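To make the gap concrete, here is a small sketch that rebuilds the sample frame and runs the unconditional running total. Keeping the dates as plain strings is an assumption (the post does not state the dtype), and `naive` is an illustrative name only.

```python
import pandas as pd

# Sample frame from the question; dates are kept as plain strings (an assumption).
df = pd.DataFrame({
    "Date_1": ["01/09/2019"] * 11,
    "Date_2": ["02/08/2019"] * 2 + ["04/08/2019"] * 2
              + ["14/08/2019"] * 3 + ["15/08/2019"] * 4,
    "is_B": [1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    "weight_1": [254, 320, 244, 247, 343, 161, 386, 465, 133, 310, 155],
})

# Unconditional running total per Date_1: every row is subtracted, is_B is ignored.
naive = 5000 - df.groupby("Date_1")["weight_1"].cumsum()

print(naive.iloc[3])  # 3935, matches the expected output
print(naive.iloc[4])  # 3592, although the expected value is still 3935
```

The row with is_B = 0 is where the two diverge, and every later row inherits the extra 343.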
Can anyone suggest a pandas or NumPy way to do this?
Edit
Expected output
Date_1 Date_2 is_B weight_1 new_weight
01/09/2019 02/08/2019 1 254 5000-254
01/09/2019 02/08/2019 1 320 5000-254-320
01/09/2019 04/08/2019 1 244 5000-254-320-244
01/09/2019 04/08/2019 1 247 5000-254-320-244-247
01/09/2019 14/08/2019 0 343 5000-254-320-244-247 (we won't subtract 343 as is_B = 0)
01/09/2019 14/08/2019 1 161 .
01/09/2019 14/08/2019 1 386 .
01/09/2019 15/08/2019 1 465 .
01/09/2019 15/08/2019 1 133 .
01/09/2019 15/08/2019 1 310 .
01/09/2019 15/08/2019 1 155 .
Thanks
Answer 0: (score: 1)
Try this:
# Zero out weight_1 where is_B != 1, then take the running total within each Date_1
counted = df['weight_1'].where(df['is_B'].eq(1), 0)
df['new_weight'] = 5000 - counted.groupby(df['Date_1']).cumsum()
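The key step here is `where`, which keeps a value when the condition is true and substitutes the second argument otherwise. A minimal sketch on toy data (the `weight` and `flag` series and the starting value of 100 are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
weight = pd.Series([10, 20, 30, 40])
flag = pd.Series([1, 0, 1, 1])

# Keep the weight where flag == 1, substitute 0 elsewhere.
counted = weight.where(flag.eq(1), 0)
print(counted.tolist())                   # [10, 0, 30, 40]

# A zeroed row adds nothing, so the running result simply repeats.
print((100 - counted.cumsum()).tolist())  # [90, 90, 60, 20]
```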
Answer 1: (score: 1)
It looks like you only need a simple multiplication before the groupby:
df['new_weight'] = 5000 - (df['weight_1'].mul(df['is_B'])
.groupby(df['Date_1'])
.cumsum()
)
Output:
Date_1 Date_2 is_B weight_1 new_weight
0 01/09/2019 02/08/2019 1 254 4746
1 01/09/2019 02/08/2019 1 320 4426
2 01/09/2019 04/08/2019 1 244 4182
3 01/09/2019 04/08/2019 1 247 3935
4 01/09/2019 14/08/2019 0 343 3935
5 01/09/2019 14/08/2019 1 161 3774
6 01/09/2019 14/08/2019 1 386 3388
7 01/09/2019 15/08/2019 1 465 2923
8 01/09/2019 15/08/2019 1 133 2790
9 01/09/2019 15/08/2019 1 310 2480
10 01/09/2019 15/08/2019 1 155 2325
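The sample only contains one Date_1, so it does not show the per-date restart. The sketch below uses a hypothetical frame with a second, invented date to illustrate it. It assumes is_B only ever holds 0 or 1, since the multiplication relies on that.

```python
import pandas as pd

# Hypothetical frame with two Date_1 values (the second date is invented).
df = pd.DataFrame({
    "Date_1": ["01/09/2019"] * 3 + ["02/09/2019"] * 3,
    "is_B": [1, 0, 1, 1, 1, 0],
    "weight_1": [100, 200, 300, 400, 500, 600],
})

# Multiplying by is_B zeroes the skipped rows; cumsum restarts for each Date_1.
df["new_weight"] = 5000 - df["weight_1"].mul(df["is_B"]).groupby(df["Date_1"]).cumsum()

print(df["new_weight"].tolist())
# [4900, 4900, 4600, 4600, 4100, 4100]
```

The fourth row starts again from 5000 because it belongs to a new Date_1.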
Answer 2: (score: 1)
You can use DataFrame.mask + Series.cumsum:
df['new_weight']=5000-(df.mask(df['is_B'].eq(0)).groupby('Date_1')['weight_1'].cumsum()).ffill()
print(df)
Date_1 Date_2 is_B weight_1 new_weight
0 01/09/2019 02/08/2019 1 254 4746.0
1 01/09/2019 02/08/2019 1 320 4426.0
2 01/09/2019 04/08/2019 1 244 4182.0
3 01/09/2019 04/08/2019 1 247 3935.0
4 01/09/2019 14/08/2019 0 343 3935.0
5 01/09/2019 14/08/2019 1 161 3774.0
6 01/09/2019 14/08/2019 1 386 3388.0
7 01/09/2019 15/08/2019 1 465 2923.0
8 01/09/2019 15/08/2019 1 133 2790.0
9 01/09/2019 15/08/2019 1 310 2480.0
10 01/09/2019 15/08/2019 1 155 2325.0
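One caveat: the trailing `ffill()` runs over the whole column, so if a Date_1 group opened with is_B = 0 it would pick up the last value of the previous group. The sample has a single date, so that does not show up there. A possible variant that fills within each group is sketched below. The frame is hypothetical, and treating a leading is_B = 0 row as "nothing subtracted yet" (5000) is an assumption the question does not settle.

```python
import pandas as pd

# Hypothetical frame: the second Date_1 group opens with is_B == 0.
df = pd.DataFrame({
    "Date_1": ["a", "a", "b", "b"],
    "is_B": [1, 1, 0, 1],
    "weight_1": [100, 200, 50, 300],
})

# NaN out the skipped rows, then accumulate within each Date_1.
running = df["weight_1"].mask(df["is_B"].eq(0)).groupby(df["Date_1"]).cumsum()

# Forward-fill inside each group only; a leading gap becomes 0 (assumption).
running = running.groupby(df["Date_1"]).ffill().fillna(0)

df["new_weight"] = 5000 - running
print(df["new_weight"].tolist())
# [4900.0, 4700.0, 5000.0, 4700.0]
```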
Answer 3: (score: 0)
This will give you the desired values in the new column ("new_weight"):
df.loc[df.is_B == 0, 'new_weight'] = df['weight_1']
df.loc[df.is_B == 1, 'new_weight'] = 5000 - df.groupby('Date_1')['weight_1'].cumsum()
I'm not sure whether this answers "If is_B = 0, the older new_weight value is copied into new_weight."
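For the carry-forward rule quoted above, it should be enough for an is_B = 0 row to contribute nothing to the running total, since the result then repeats the previous value. Because the question also asks about NumPy, here is a minimal sketch with `np.where` and `np.cumsum`. It assumes all rows share one Date_1 (as in the sample), and the array names are illustrative.

```python
import numpy as np

# Columns from the question as arrays; assumes all rows share one Date_1.
is_b = np.array([1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1])
weight_1 = np.array([254, 320, 244, 247, 343, 161, 386, 465, 133, 310, 155])

# Rows with is_b == 0 contribute 0, so the running result repeats there.
new_weight = 5000 - np.cumsum(np.where(is_b == 1, weight_1, 0))

print(new_weight.tolist())
# [4746, 4426, 4182, 3935, 3935, 3774, 3388, 2923, 2790, 2480, 2325]
```

With several Date_1 values the cumulative sum would need to restart per group, which is what the pandas groupby versions handle.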