Append columns to a DataFrame based on the ranking of values in earlier columns

Time: 2018-06-22 23:26:55

Tags: python pandas sorting dataframe

My DataFrame looks like this:

         Date       AAPL      NFLX       INTC
20 2008-01-31  27.834286  3.764286  25.350000
40 2008-02-29  27.847143  3.724286  24.670000
60 2008-03-31  27.834286  3.764286  25.350000

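Imagine the values above are % returns. How can I rank the values across the 3 columns in each row, so that the DataFrame ends up looking like this: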

         Date       AAPL       NFLX       INTC  Rank_Max  Rank_Min
20 2008-01-31  27.834286   3.764286  25.350000      AAPL      NFLX
40 2008-02-29  27.847143  33.724286  24.670000      NFLX      INTC
60 2008-03-31  27.834286   3.764286  25.350000      etc

Thanks.

2 Answers:

Answer 0: (score: 2)

First, compute the row-wise ranks (as a side effect, this call also filters out all non-numeric columns):

ranks = df.rank(axis=1, numeric_only=True)

Next, find the column labels of the maximum and minimum ranks in each row:

df['Rank_Max'] = ranks.idxmax(axis=1)
df['Rank_Min'] = ranks.idxmin(axis=1)
df
#          Date       AAPL       NFLX   INTC Rank_Max Rank_Min
#20  2008-01-31  27.834286   3.764286  25.35     AAPL     NFLX
#40  2008-02-29  27.847143  33.724286  24.67     NFLX     INTC
#60  2008-03-31  27.834286   3.764286  25.35     AAPL     NFLX
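
For anyone who wants to run the two steps above end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the printed output in this answer (note that row 40 uses NFLX = 33.724286, as in the expected output, not the 3.724286 shown in the question's first table), and Date is assumed to be stored as plain strings.

```python
import pandas as pd

# Sample data copied from the output printed in this answer.
# Assumption: Date is a plain string column, so numeric_only=True drops it.
df = pd.DataFrame(
    {
        "Date": ["2008-01-31", "2008-02-29", "2008-03-31"],
        "AAPL": [27.834286, 27.847143, 27.834286],
        "NFLX": [3.764286, 33.724286, 3.764286],
        "INTC": [25.350000, 24.670000, 25.350000],
    },
    index=[20, 40, 60],
)

# Rank the numeric columns within each row (1 = smallest value).
ranks = df.rank(axis=1, numeric_only=True)

# The column label holding the highest / lowest rank in each row.
df["Rank_Max"] = ranks.idxmax(axis=1)
df["Rank_Min"] = ranks.idxmin(axis=1)

print(df)
```

Because ranking preserves the ordering of the values, the labels picked here should match what idxmax and idxmin return on the raw values; the intermediate ranks frame is mainly useful if the middle positions are needed as well.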

Answer 1: (score: 0)

Use idxmax and idxmin:

df['Rank_Max'] = df[['AAPL', 'NFLX', 'INTC']].idxmax(axis=1)
df['Rank_Min'] = df[['AAPL', 'NFLX', 'INTC']].idxmin(axis=1)

print(df)

          Date       AAPL      NFLX   INTC Rank_Max Rank_Min
20  2008-01-31  27.834286  3.764286  25.35     AAPL     NFLX
40  2008-02-29  27.847143  3.724286  24.67     AAPL     NFLX
60  2008-03-31  27.834286  3.764286  25.35     AAPL     NFLX
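
If the ticker columns are not known in advance, one possible variation is to select the numeric columns by dtype instead of listing them by name. This is only a sketch under that assumption: the variable name `values` is introduced here for illustration, and the data is copied from the output printed above.

```python
import pandas as pd

# Sample data copied from the output printed in this answer.
df = pd.DataFrame(
    {
        "Date": ["2008-01-31", "2008-02-29", "2008-03-31"],
        "AAPL": [27.834286, 27.847143, 27.834286],
        "NFLX": [3.764286, 3.724286, 3.764286],
        "INTC": [25.35, 24.67, 25.35],
    },
    index=[20, 40, 60],
)

# Select the numeric columns once, before the new label columns are added,
# so Rank_Max / Rank_Min can never be picked up by a later selection.
values = df.select_dtypes(include="number")

# Column label of the largest / smallest value in each row.
df["Rank_Max"] = values.idxmax(axis=1)
df["Rank_Min"] = values.idxmin(axis=1)

print(df)
```

When two columns tie within a row, idxmax and idxmin return the first matching label in column order, so the result depends on how the columns are arranged.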