Subtract a row from another row of the same column with pandas

Time: 2018-07-24 10:54:33

Tags: python-3.x pandas pandas-groupby

I want to use pandas to subtract a row value from another row value in the same column. My DataFrame:

holderName    policyItemName    writtenFee    policyNo.   writtenPremium

Robert Nelson   Policy Fee         25        2017-5124     10
Robert Nelson   Policy Fee         25        2017-5124     12
Robert Nelson   policy Fee         25        2017-5124     54
Robert Nelson   Policy Fee         25        2017-5124     123
Karen Jordan    Policy Fee         25        2017-1289     321
Karen Jordan    Policy Fee         25        2017-1289     500
Karen Jordan    Policy Fee         25        2017-1289     400

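I want to subtract 'writtenPremium' from top to bottom: the first row's premium stays as it is, the second row's result is its 'writtenPremium' minus the first row's, the third row's result is its 'writtenPremium' minus the second row's, and so on. This should apply only to rows that share the same 'policyNo.'. The result can go into another column.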

Desired output:

holderName policyItemName writtenFee policyNo. writtenPremium  derivedPremium

Robert Nelson   Policy Fee   25     2017-5124   10             10 
Robert Nelson   Policy Fee   25     2017-5124   12             12-10=2
Robert Nelson   Policy Fee   25     2017-5124   54             54-12=42
Robert Nelson   Policy Fee   25     2017-5124   123            123-54=69
Karen Jordan    Policy Fee   25     2017-1289   30             30
Karen Jordan    Policy Fee   25     2017-1289   50             50-30=20
Karen Jordan    Policy Fee   25     2017-1289   40             40-50=-10

Any help is greatly appreciated.

1 Answer:

Answer 0: (score: 2)

Use DataFrameGroupBy.diff together with fillna to replace the leading NaN of each group:

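# Per-policy difference from the previous row; the first row of each group is NaN,
# so fill it with that row's own writtenPremium.
df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
                          .diff()
                          .fillna(df['writtenPremium']))
print(df)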

      holderName policyItemName  writtenFee  policyNo.  writtenPremium  \
0  Robert Nelson     Policy Fee          25  2017-5124              10   
1  Robert Nelson     Policy Fee          25  2017-5124              12   
2  Robert Nelson     policy Fee          25  2017-5124              54   
3  Robert Nelson     Policy Fee          25  2017-5124             123   
4   Karen Jordan     Policy Fee          25  2017-1289              30   
5   Karen Jordan     Policy Fee          25  2017-1289              50   
6   Karen Jordan     Policy Fee          25  2017-1289              40   

   derivedPremium  
0            10.0  
1             2.0  
2            42.0  
3            69.0  
4            30.0  
5            20.0  
6           -10.0  
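
For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the printed output, so its construction is an assumption about how the real data is loaded; the intermediate name per_policy_diff is introduced only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the printed output above (an assumption about the real frame).
df = pd.DataFrame({
    "holderName": ["Robert Nelson"] * 4 + ["Karen Jordan"] * 3,
    "policyItemName": ["Policy Fee", "Policy Fee", "policy Fee", "Policy Fee",
                       "Policy Fee", "Policy Fee", "Policy Fee"],
    "writtenFee": [25] * 7,
    "policyNo.": ["2017-5124"] * 4 + ["2017-1289"] * 3,
    "writtenPremium": [10, 12, 54, 123, 30, 50, 40],
})

# diff() within each policy gives current minus previous row;
# the first row of each policy has no previous row, so it comes back as NaN.
per_policy_diff = df.groupby("policyNo.")["writtenPremium"].diff()

# fillna with a Series aligns on the index, so each NaN takes that row's own premium.
df["derivedPremium"] = per_policy_diff.fillna(df["writtenPremium"])

print(df[["policyNo.", "writtenPremium", "derivedPremium"]])
```

Note that diff() follows the existing row order within each group, so if the rows are not already in the intended order, sort them first.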

If you are working only with integers, add a final step that converts the result with astype(int):

# Same as above, then cast the float result back to int (safe once no NaN remains).
df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
                          .diff()
                          .fillna(df['writtenPremium'])
                          .astype(int))
print(df)

      holderName policyItemName  writtenFee  policyNo.  writtenPremium  \
0  Robert Nelson     Policy Fee          25  2017-5124              10   
1  Robert Nelson     Policy Fee          25  2017-5124              12   
2  Robert Nelson     policy Fee          25  2017-5124              54   
3  Robert Nelson     Policy Fee          25  2017-5124             123   
4   Karen Jordan     Policy Fee          25  2017-1289              30   
5   Karen Jordan     Policy Fee          25  2017-1289              50   
6   Karen Jordan     Policy Fee          25  2017-1289              40   

   derivedPremium  
0              10  
1               2  
2              42  
3              69  
4              30  
5              20  
6             -10
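
The float column appears because diff() produces NaN for the first row of each group. As a possible alternative (not part of the answer above), the previous row can be fetched with a grouped shift and a fill value of 0, so no NaN is created in the first place. This is a sketch that assumes a pandas version whose groupby shift accepts fill_value; the name previous is hypothetical, and only the two relevant columns are rebuilt here.

```python
import pandas as pd

# Only the columns needed for the calculation, rebuilt from the output above.
df = pd.DataFrame({
    "policyNo.": ["2017-5124"] * 4 + ["2017-1289"] * 3,
    "writtenPremium": [10, 12, 54, 123, 30, 50, 40],
})

# Previous row's premium within the same policy; the first row of each policy gets 0.
previous = df.groupby("policyNo.")["writtenPremium"].shift(fill_value=0)

# Subtracting 0 leaves the first row of each policy unchanged.
df["derivedPremium"] = df["writtenPremium"] - previous

print(df)
```

The values match the diff/fillna version; the difference is only that the first row of each group is handled by the fill value instead of a later fillna.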