我有一个数据框
I_Code Date_1 Date_2 C_B
4 01/09/2019 02/08/2019 7
4 01/09/2019 03/08/2019 1189
1 02/09/2019 03/08/2019 3
2 02/09/2019 03/08/2019 4
1 01/09/2019 04/08/2019 8
4 01/09/2019 04/08/2019 1650
1 02/09/2019 04/08/2019 138
2 02/09/2019 04/08/2019 8
1 03/09/2019 04/08/2019 6
2 03/09/2019 04/08/2019 2
1 01/09/2019 05/08/2019 9
4 01/09/2019 05/08/2019 1805
1 02/09/2019 05/08/2019 259
2 02/09/2019 05/08/2019 1062
1 03/09/2019 05/08/2019 4
2 03/09/2019 05/08/2019 161
2 04/09/2019 05/08/2019 7
5 04/09/2019 05/08/2019 12
1 01/09/2019 06/08/2019 63
2 01/09/2019 06/08/2019 2
我想计算为每个date_1和I_code计算的列c_b1,这样对于每个date_1和I_Code我想为下一个条目减去C_b。例如-
I_Code Date_1 Date_2 C_B C_B1
4 01/09/2019 02/08/2019 7 7
4 01/09/2019 03/08/2019 1189 1189-7 (Since date_1 and I_Code are same in entry 1)
1 02/09/2019 03/08/2019 3 3
2 02/09/2019 03/08/2019 4 4
1 01/09/2019 04/08/2019 8 8
4 01/09/2019 04/08/2019 1650 1650-1189 (Since date_1 and I_Code are same in entry 2)
1 02/09/2019 04/08/2019 138 138-3
2 02/09/2019 04/08/2019 8 8-3
任何人都可以帮助实现相同目标。
谢谢
答案 0 :(得分:2)
使用DataFrameGroupBy.shift
,将缺失值替换为0
并减去:
df['C_B1'] = df['C_B'] - df.groupby(['I_Code','Date_1'])['C_B'].shift().fillna(0)
使用Series.sub
和fill_value=0
参数的替代方法:
df['C_B1'] = df['C_B'].sub(df.groupby(['I_Code','Date_1'])['C_B'].shift(), fill_value=0)
print (df)
I_Code Date_1 Date_2 C_B C_B1
0 4 01/09/2019 02/08/2019 7 7.0
1 4 01/09/2019 03/08/2019 1189 1182.0
2 1 02/09/2019 03/08/2019 3 3.0
3 2 02/09/2019 03/08/2019 4 4.0
4 1 01/09/2019 04/08/2019 8 8.0
5 4 01/09/2019 04/08/2019 1650 461.0
6 1 02/09/2019 04/08/2019 138 135.0
7 2 02/09/2019 04/08/2019 8 4.0
8 1 03/09/2019 04/08/2019 6 6.0
9 2 03/09/2019 04/08/2019 2 2.0
10 1 01/09/2019 05/08/2019 9 1.0
11 4 01/09/2019 05/08/2019 1805 155.0
12 1 02/09/2019 05/08/2019 259 121.0
13 2 02/09/2019 05/08/2019 1062 1054.0
14 1 03/09/2019 05/08/2019 4 -2.0
15 2 03/09/2019 05/08/2019 161 159.0
16 2 04/09/2019 05/08/2019 7 7.0
17 5 04/09/2019 05/08/2019 12 12.0
18 1 01/09/2019 06/08/2019 63 54.0
19 2 01/09/2019 06/08/2019 2 2.0
答案 1 :(得分:1)
IIUC:
df["C_B_1"] = df.groupby(["Date_1","I_Code"])["C_B"].apply(lambda x: x-x.shift()).fillna(df['C_B'])
print (df)
#
I_Code Date_1 Date_2 C_B C_B_1
0 4 01/09/2019 02/08/2019 7 7.0
1 4 01/09/2019 03/08/2019 1189 1182.0
2 1 02/09/2019 03/08/2019 3 3.0
3 2 02/09/2019 03/08/2019 4 4.0
4 1 01/09/2019 04/08/2019 8 8.0
5 4 01/09/2019 04/08/2019 1650 461.0
6 1 02/09/2019 04/08/2019 138 135.0
7 2 02/09/2019 04/08/2019 8 4.0
8 1 03/09/2019 04/08/2019 6 6.0
9 2 03/09/2019 04/08/2019 2 2.0
10 1 01/09/2019 05/08/2019 9 1.0
11 4 01/09/2019 05/08/2019 1805 155.0
12 1 02/09/2019 05/08/2019 259 121.0
13 2 02/09/2019 05/08/2019 1062 1054.0
14 1 03/09/2019 05/08/2019 4 -2.0
15 2 03/09/2019 05/08/2019 161 159.0
16 2 04/09/2019 05/08/2019 7 7.0
17 5 04/09/2019 05/08/2019 12 12.0
18 1 01/09/2019 06/08/2019 63 54.0
19 2 01/09/2019 06/08/2019 2 2.0