Filter a DataFrame and add the newly created column to the original df

Time: 2020-07-27 20:33:48

Tags: python python-3.x pandas dataframe for-loop

Is there a simple way to run a calculation on each fruit in turn and add the newly created column to the original df?

df
 concatted  score      fruit        status   date              
 apple_bana  0.500      apple       high    2010-02-20         
      apple  0.600      apple      low     2010-02-21          
     banana  0.530      pear       low     2010-01-12        
Expected output:
 concatted  score      fruit        status   date              first_diff  
 apple_bana  0.500      apple       high    2010-02-20                     
      apple  0.600      apple      low     2010-02-21            0.1
     banana  0.530      pear       low     2010-01-12        
I tried:
fruits = ['apple', 'banana', 'pair']
for fruit in fruits :
    selected_rows = df[(df['fruit'] == fruit)]
    selected_rows['first_diff']= df.score.diff().dropna()
    df = df.append(selected_rows)

1 answer:

Answer 0 (score: 2)

Group by fruit with groupby(), then apply .diff() to score:

df['first_diff']=df[['concatted', 'score', 'fruit', 'status', 'date']].groupby('fruit')['score'].diff().fillna('')

If you need something more general that does not hard-code the column names, try:

df['first_diff']=df[[x for x in df.columns]].groupby('fruit')['score'].diff().fillna('')

     concatted  score  fruit status    date       first_diff
0  apple_bana   0.50  apple   high  2010-02-20           
1       apple   0.60  apple    low  2010-02-21        0.1
2      banana   0.53   pear    low  2010-01-12   
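
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the date column is kept as plain strings, which is an assumption) and calls groupby() directly on df, since selecting every column first should not change the result. Unlike the answer above, it leaves the missing differences as NaN rather than filling them with '', so first_diff stays a numeric column.

```python
import pandas as pd

# Sample data copied from the question; dates kept as strings (assumption).
df = pd.DataFrame({
    "concatted": ["apple_bana", "apple", "banana"],
    "score": [0.500, 0.600, 0.530],
    "fruit": ["apple", "apple", "pear"],
    "status": ["high", "low", "low"],
    "date": ["2010-02-20", "2010-02-21", "2010-01-12"],
})

# Difference between each score and the previous score of the same fruit.
# The result is aligned to df's index, so it can be assigned directly
# without looping over fruits or appending rows.
df["first_diff"] = df.groupby("fruit")["score"].diff()

print(df)
```

The first row of each fruit has no previous score, so its first_diff is NaN. If blanks are preferred for display, .fillna('') can be applied as in the answer, at the cost of turning the column into object dtype.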