如果我具有以下数据框
Base Pay Overtime Pay Other Pay Benefits
Adam 200000 31000 5000 64000
Ben 210000 27000 7000 57000
Scott 190000 40000 9000 65000
David 220000 26000 4000 61000
Matthew 195000 29000 10000 63000
Mark 205000 37000 8000 59000
然后我要返回以下数据框
1st 2nd 3rd
Base Pay David Ben Mark
Overtime Pay Scott Mark Adam
Other Pay Matthew Scott Mark
Benefits Scott Adam Matthew
我知道如何计算每列中的3个最大值,但不能同时计算。
答案 0 :(得分:3)
使用argsort
:
df = pd.DataFrame(...).T
result = pd.DataFrame(df.columns[(-df.values).argsort(axis=1)[:, :3]],
columns=["1st","2nd","3rd"],
index=df.index)
print (result)
#
1st 2nd 3rd
BasePay David Ben Mark
OvertimePay Scott Mark Adam
OtherPay Matthew Scott Mark
Benefits Scott Adam Matthew
答案 1 :(得分:2)
短一点:
df = df.T.apply(lambda s: s.abs().nlargest(3).index.tolist(), axis=1)
>>> df2 = pd.DataFrame()
>>> df2[['1st','2nd','3rd']] = pd.DataFrame(df.values.tolist(), index= df.index)
答案 2 :(得分:1)
这是一种方法
s=df.stack().sort_values(ascending=False).groupby(level=1).head(3).reset_index()
s['Id']=s.groupby('level_1').cumcount()+1
s.pivot(index='level_1',columns='Id',values='level_0')
Out[114]:
Id 1 2 3
level_1
BasePay David Ben Mark
Benefits Scott Adam Matthew
OtherPay Matthew Scott Mark
OvertimePay Scott Mark Adam