Pandas - group by one column, then sort by all remaining columns

Time: 2020-07-22 00:35:21

Tags: python pandas

I need to group a dataframe by "player_slug" and then sort it by each of the (numeric) "mean" columns in turn.

Note that the column values are already means.

Here is df.head(5):

   player_slug  player_id player_nickname  player_team player_position  ...   DD_mean   DP_mean    status  price_diff  last_points
0   paulo-andre      37604     Paulo André          293             zag  ...  0.000000  0.000000  Provável        0.11          1.7
1       evandro      37614         Evandro          277             mei  ...  0.000000  0.000000    Dúvida       -1.78          2.8
2         betao      37646           Betão          314             zag  ...  0.000000  0.000000  Provável       -0.14          0.1
3  rafael-moura      37655    Rafael Moura          290             ata  ...  0.000000  0.000000  Provável        2.89         22.2
4         fabio      37656           Fábio          283             gol  ...  1.257143  0.057143  Provável        0.42          2.0

I tried writing a function and calling it once per feature, like this:

columns = ['score_mean','score_no_cleansheets_mean','diff_home_away_s',
           'n_games','score_mean_home','score_mean_away','shots_x_mean','fouls_mean','RB_mean',
           'PE_mean','A_mean','I_mean','FS_mean','FF_mean','G_mean','DD_mean','DP_mean',
           'price_diff','last_points']

def sorted_medias(df, feature=None):
    df_agg = df.groupby(['player_slug', 'player_team']).agg({feature:'sum'}).sort_values(feature, ascending=False)
    print(df_agg)

And finally:

for feature in columns:
   sorted_medias(df_medias, feature)

But I'm not sure whether to use 'sum' or 'mean' in agg, since the values are already means.

What is the right way to go here?

1 Answer:

Answer 0 (score: 0)

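This looks like what the OP is asking for: group by player and pick any value within each group, since the values are already aggregated. ('goals' below is presumably a placeholder for one of the numeric columns, such as 'G_mean'.)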

df.groupby(['player_slug'])['goals'].min().sort_values(ascending=False)
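
As a rough sketch of how this could be applied to every column in the question's loop, the example below rebuilds a tiny frame from a few values shown in df.head(5) and returns one sorted Series per feature. It assumes each player appears in exactly one row, in which case 'first', 'min', 'mean' and 'sum' all give the same number. The choice of 'first' and the name rankings are illustrative, not part of the original answer.

```python
import pandas as pd

# Toy frame built from a few values visible in df.head(5).
# Assumption: one row per player, values already averaged upstream.
df_medias = pd.DataFrame({
    "player_slug": ["paulo-andre", "evandro", "betao", "rafael-moura", "fabio"],
    "player_team": [293, 277, 314, 290, 283],
    "DD_mean": [0.0, 0.0, 0.0, 0.0, 1.257143],
    "price_diff": [0.11, -1.78, -0.14, 2.89, 0.42],
    "last_points": [1.7, 2.8, 0.1, 22.2, 2.0],
})

# Subset of the question's column list that exists in the toy frame.
columns = ["DD_mean", "price_diff", "last_points"]


def sorted_medias(df, feature):
    """Return one value per player for `feature`, sorted descending."""
    return (
        df.groupby(["player_slug", "player_team"])[feature]
        .first()  # any single-value picker works when groups have one row
        .sort_values(ascending=False)
    )


# One sorted Series per feature (hypothetical helper dict).
rankings = {feature: sorted_medias(df_medias, feature) for feature in columns}

for feature, ranking in rankings.items():
    print(ranking, end="\n\n")
```

If a player could appear in more than one row, the aggregations would no longer agree, and the choice between them would depend on what the repeated rows mean.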