Sort a DataFrame based on multiple columns and conditions

Time: 2019-09-11 14:40:12

Tags: python pandas sorting

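I am trying to sort the following DataFrame by rolls in descending order first, then by diff_vto ascending for the positive values, and finally by diff_vto ascending for the negative values. This is the original DataFrame: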

    day  prob  vto  rolls  diff  diff_vto
0     1    10   14   27.0   0.0       -13
1     2    10   14   20.0   3.0       -12
2     3     7   14   16.0   4.0       -11
3     4     3   14   12.0  -3.0       -10
4     5     6   14   17.0   3.0        -9
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5
9    10     3   14   17.0   0.0        -4
10   11     3   14   22.0  -8.0        -3
11   12    11   14   27.0   3.0        -2
12   13     8   14   23.0   0.0        -1
13   14     8   14   25.0   1.0         0
14   15     7   14   27.0  -3.0         1

Here is the code in case you want to reproduce it:

    import pandas as pd 
    a = {'day':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],'prob':[10,10,7,3,6,3,8,3,3,3,3,11,8,8,7],'vto':[14,14,14,14,14,14,14,14,14,14,14,14,14,14,14]}
    df = pd.DataFrame(a)
    df.loc[len(df)+1] = df.loc[0] # Append a copy of day 1 so the rolling window can wrap around
    df.loc[len(df)+2] = df.loc[1] # Append a copy of day 2 for the same reason
    df['rolls'] = df['prob'].rolling(3).sum() 
    df['rolls'] = df['rolls'].shift(periods=-2) #Displace rolls to match the index + 2
    df['diff'] = df['prob'].diff(periods=-1) #Prob[i] - Prob[i+1]
    df['diff_vto'] = df['day'] - df['vto'] 
    df = df.head(15)
    print(df)

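I would like to sort by rolls (descending), then by diff_vto where it is positive (ascending), and then by diff_vto where it is negative (ascending). Based on the DataFrame posted above, this would be the expected output:

    day  prob  vto  rolls  diff  diff_vto
14   15     7   14   27.0  -3.0         1
0     1    10   14   27.0   0.0       -13
11   12    11   14   27.0   3.0        -2
13   14     8   14   25.0   1.0         0
12   13     8   14   23.0   0.0        -1
10   11     3   14   22.0  -8.0        -3
1     2    10   14   20.0   3.0       -12
4     5     6   14   17.0   3.0        -9
9    10     3   14   17.0   0.0        -4
2     3     7   14   16.0   4.0       -11
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
3     4     3   14   12.0  -3.0       -10
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5

I obviously tried applying .sort_values(), but I can't sort conditionally on diff_vto, because setting it to ascending would put the negative values before the positive ones. Any suggestions? Thanks.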

1 answer:

Answer 0 (score: 2)

After rolls, you want to sort by diff_vto > 0 and by abs(diff_vto), both descending:

    df['pos'] = df['diff_vto'].gt(0)
    df['abs'] = df['diff_vto'].abs()

    df.sort_values(['rolls', 'pos', 'abs'], ascending=[False, False, False])

Output (you can drop pos and abs afterwards if you like):

    day  prob  vto  rolls  diff  diff_vto    pos  abs
14   15     7   14   27.0  -3.0         1   True    1
0     1    10   14   27.0   0.0       -13  False   13
11   12    11   14   27.0   3.0        -2  False    2
13   14     8   14   25.0   1.0         0  False    0
12   13     8   14   23.0   0.0        -1  False    1
10   11     3   14   22.0  -8.0        -3  False    3
1     2    10   14   20.0   3.0       -12  False   12
4     5     6   14   17.0   3.0        -9  False    9
9    10     3   14   17.0   0.0        -4  False    4
2     3     7   14   16.0   4.0       -11  False   11
5     6     3   14   14.0  -5.0        -8  False    8
6     7     8   14   14.0   5.0        -7  False    7
3     4     3   14   12.0  -3.0       -10  False   10
7     8     3   14    9.0   0.0        -6  False    6
8     9     3   14    9.0   0.0        -5  False    5
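
For reference, here is a minimal self-contained sketch of the same idea that keeps the helper columns out of the final result. It assumes only the day, rolls and diff_vto values printed in the question (the frame is rebuilt by hand rather than through the rolling computation). The `strict` variant is not part of the answer above: it is one possible reading of the question in which positive values are also sorted ascending, since descending abs would order several positive values from largest to smallest. With a single positive value in this data, both give the same order.

```python
import pandas as pd

# Values copied from the table in the question; prob, vto and diff are
# omitted because they do not take part in the sort.
df = pd.DataFrame({
    'day': range(1, 16),
    'rolls': [27, 20, 16, 12, 17, 14, 14, 9, 9, 17, 22, 27, 23, 25, 27],
    'diff_vto': range(-13, 2),
})

# Same keys as the answer: positive flag and absolute value, all descending.
# assign() adds them to a temporary copy, so df itself is left unchanged.
result = (
    df.assign(pos=df['diff_vto'].gt(0), abs=df['diff_vto'].abs())
      .sort_values(['rolls', 'pos', 'abs'], ascending=[False, False, False])
      .drop(columns=['pos', 'abs'])
)

# Assumed alternative reading: positives first, then diff_vto ascending
# inside each group (positive and non-positive alike).
strict = (
    df.assign(pos=df['diff_vto'].gt(0))
      .sort_values(['rolls', 'pos', 'diff_vto'], ascending=[False, False, True])
      .drop(columns='pos')
)

print(result)
```

Note that gt(0) treats a diff_vto of 0 as non-positive, so that row is sorted together with the negative values.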