Function that returns the index of the top 3 highest entries in each column of a DataFrame

Time: 2019-10-23 14:48:40

Tags: pandas dataframe

Suppose I have the following DataFrame:

           Base Pay   Overtime Pay    Other Pay   Benefits   
Adam        200000       31000          5000       64000
Ben         210000       27000          7000       57000
Scott       190000       40000          9000       65000
David       220000       26000          4000       61000
Matthew     195000       29000         10000       63000
Mark        205000       37000          8000       59000

I want to return the following DataFrame:

                     1st      2nd     3rd
Base Pay            David    Ben     Mark
Overtime Pay        Scott    Mark     Adam  
Other Pay          Matthew  Scott     Mark
Benefits            Scott    Adam    Matthew

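I know how to find the 3 largest values in a single column, but not how to do it for every column at once.

3 answers:

Answer 0: (score: 3)

Use argsort on the negated values, so that each row is ordered from largest to smallest: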

df = pd.DataFrame(...).T

result = pd.DataFrame(df.columns[(-df.values).argsort(axis=1)[:, :3]],
                      columns=["1st","2nd","3rd"],
                      index=df.index)

print(result)

# Output:
                 1st    2nd      3rd
BasePay        David    Ben     Mark
OvertimePay    Scott   Mark     Adam
OtherPay     Matthew  Scott     Mark
Benefits       Scott   Adam  Matthew
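
Because the answer elides the construction of `df`, here is a minimal self-contained sketch of the same argsort idea. It assumes the data from the question, with names as the index and pay types as the columns; the variable names `t` and `order` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question: names are the index, pay types the columns.
df = pd.DataFrame(
    {
        "Base Pay": [200000, 210000, 190000, 220000, 195000, 205000],
        "Overtime Pay": [31000, 27000, 40000, 26000, 29000, 37000],
        "Other Pay": [5000, 7000, 9000, 4000, 10000, 8000],
        "Benefits": [64000, 57000, 65000, 61000, 63000, 59000],
    },
    index=["Adam", "Ben", "Scott", "David", "Matthew", "Mark"],
)

# Transpose so that each pay type is a row and each name is a column.
t = df.T

# Negating the values makes an ascending argsort yield descending order.
# Keep the first three column positions of every row.
order = np.argsort(-t.to_numpy(), axis=1)[:, :3]

# Map the positions back to names and label the three ranks.
result = pd.DataFrame(
    t.columns.to_numpy()[order],
    index=t.index,
    columns=["1st", "2nd", "3rd"],
)
print(result)
```

This works on the raw NumPy array, so it tends to be fast, but note that how tied values are ordered depends on the sort algorithm.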

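Answer 1: (score: 2)

A bit shorter:

# For each pay type (a row of df.T), list the three names with the largest values.
top = df.T.apply(lambda s: s.abs().nlargest(3).index.tolist(), axis=1)
# Expand the lists into three labelled columns.
df2 = pd.DataFrame()
df2[['1st','2nd','3rd']] = pd.DataFrame(top.values.tolist(), index=top.index)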
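
As a variation on the nlargest approach, the following sketch builds the labelled result in one step by returning a Series from `apply`. It assumes the question's data with names as the index; the helper name `top3_names` and the `labels` list are hypothetical and not taken from the answer.

```python
import pandas as pd

# Sample data copied from the question: names are the index, pay types the columns.
df = pd.DataFrame(
    {
        "Base Pay": [200000, 210000, 190000, 220000, 195000, 205000],
        "Overtime Pay": [31000, 27000, 40000, 26000, 29000, 37000],
        "Other Pay": [5000, 7000, 9000, 4000, 10000, 8000],
        "Benefits": [64000, 57000, 65000, 61000, 63000, 59000],
    },
    index=["Adam", "Ben", "Scott", "David", "Matthew", "Mark"],
)

labels = ["1st", "2nd", "3rd"]


def top3_names(column: pd.Series) -> pd.Series:
    """Return the index labels of the three largest values, labelled by rank."""
    return pd.Series(column.nlargest(3).index.tolist(), index=labels)


# apply() runs once per column; transposing puts the pay types on the rows.
result = df.apply(top3_names).T
print(result)
```

Unlike the answer above, this sketch omits `.abs()`, which only matters if the data can contain negative values.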

Answer 2: (score: 1)

Here is one way, using stack, groupby and pivot:

s=df.stack().sort_values(ascending=False).groupby(level=1).head(3).reset_index()
s['Id']=s.groupby('level_1').cumcount()+1
s.pivot(index='level_1',columns='Id',values='level_0')
Out[114]: 
Id                 1      2        3
level_1                             
BasePay        David    Ben     Mark
Benefits       Scott   Adam  Matthew
OtherPay     Matthew  Scott     Mark
OvertimePay    Scott   Mark     Adam
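
The `level_0` and `level_1` names above come from the unnamed index levels that `stack` produces. A self-contained sketch of the same idea with explicit names might look like the following; the axis names `name` and `pay_type` and the column names `amount` and `rank` are assumptions introduced here for readability.

```python
import pandas as pd

# Sample data copied from the question: names are the index, pay types the columns.
df = pd.DataFrame(
    {
        "Base Pay": [200000, 210000, 190000, 220000, 195000, 205000],
        "Overtime Pay": [31000, 27000, 40000, 26000, 29000, 37000],
        "Other Pay": [5000, 7000, 9000, 4000, 10000, 8000],
        "Benefits": [64000, 57000, 65000, 61000, 63000, 59000],
    },
    index=["Adam", "Ben", "Scott", "David", "Matthew", "Mark"],
)

# Long format: one row per (name, pay type) pair, sorted with the largest values first.
long = (
    df.rename_axis(index="name", columns="pay_type")
    .stack()
    .rename("amount")
    .sort_values(ascending=False)
    .reset_index()
)

# Keep the first three rows of each pay type and number them 1 to 3 in sorted order.
top3 = long.groupby("pay_type").head(3).copy()
top3["rank"] = top3.groupby("pay_type").cumcount() + 1

# Wide format again: pay types on the rows, ranks on the columns, names as values.
result = top3.pivot(index="pay_type", columns="rank", values="name")
print(result)
```

As in the answer's output, `pivot` sorts the row labels alphabetically, so the pay types do not keep their original column order.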