Is there a simple way to run the calculation for each fruit in turn and add the newly created column to the original df?
df
concatted   score  fruit  status  date
apple_bana  0.500  apple  high    2010-02-20
apple       0.600  apple  low     2010-02-21
banana      0.530  pear   low     2010-01-12
Expected output:
concatted   score  fruit  status  date        first_diff
apple_bana  0.500  apple  high    2010-02-20
apple       0.600  apple  low     2010-02-21  0.1
banana      0.530  pear   low     2010-01-12
I tried:
fruits = ['apple', 'banana', 'pair']
for fruit in fruits:
    selected_rows = df[(df['fruit'] == fruit)]
    selected_rows['first_diff'] = df.score.diff().dropna()
    df = df.append(selected_rows)
Answer 0 (score: 2)
Use groupby() on fruit, then apply .diff() to score:
df['first_diff']=df[['concatted', 'score', 'fruit', 'status', 'date']].groupby('fruit')['score'].diff().fillna('')
If you need something more general, try:
df['first_diff']=df[[x for x in df.columns]].groupby('fruit')['score'].diff().fillna('')
    concatted  score  fruit status        date first_diff
0  apple_bana   0.50  apple   high  2010-02-20
1       apple   0.60  apple    low  2010-02-21        0.1
2      banana   0.53   pear    low  2010-01-12
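For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby/diff approach. It rebuilds the sample frame from the question and makes two changes that are my own assumptions, not part of the answer above: it sorts by date before grouping, so that each difference is taken between consecutive dates, and it leaves the missing values as NaN instead of filling them with '', which keeps first_diff a numeric column. Selecting the columns before groupby() is not needed for this to work, so the sketch calls groupby() on the frame directly.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "concatted": ["apple_bana", "apple", "banana"],
    "score": [0.500, 0.600, 0.530],
    "fruit": ["apple", "apple", "pear"],
    "status": ["high", "low", "low"],
    "date": pd.to_datetime(["2010-02-20", "2010-02-21", "2010-01-12"]),
})

# Assumption: differences should follow date order within each fruit.
# The result is aligned back to df by index, so df keeps its row order.
df["first_diff"] = (
    df.sort_values("date")
      .groupby("fruit")["score"]
      .diff()
)

print(df)
```

The first row of each fruit has no previous score, so its first_diff is NaN. If blank cells are wanted for display only, .fillna('') can be applied to a copy at print time.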