Categorizing a df based on column order

Date: 2020-06-09 05:56:08

Tags: python pandas dataframe sorting

I have to sort by the columns in the order C, D, E, B, A and categorize the rows in ascending order.

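Where:

  • Rows with D False and E True should have the highest priority after column C
  • Columns Max(B) and Max(A) should have the next priority after C, D, and E
  • Finally, rows where C and B are zero should have the lowest priority. (D and E should be ranked after Max(A))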

df

        A   B   C      D      E
    0   8   5   0  False   True
    1  45  35   0   True  False
    2  35  10   1  False   True
    3  40   5   2   True  False
    4  12  10   5  False  False
    5  18  15  13  False   True
    6  25  15   5   True  False
    7  35  10  11  False   True
    8  95  50   0  False  False
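For anyone who wants to reproduce the problem, the frame above could be rebuilt as follows. This is a minimal sketch: the values are copied from the table, and the default integer index 0 to 8 is assumed.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'A': [8, 45, 35, 40, 12, 18, 25, 35, 95],
    'B': [5, 35, 10, 5, 10, 15, 15, 10, 50],
    'C': [0, 0, 1, 2, 5, 13, 5, 11, 0],
    'D': [False, True, False, True, False, False, True, False, False],
    'E': [True, False, True, False, False, True, False, True, False],
})

print(df)
```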

Conditions for sorting, in the following column order:

Example condition order:

Max(C), D, E, Max(B) and Max(A) where C != 0 and B != 0
Also, rows with D `True` and E `False` should be on top

C, D, E, A, Max(B), and Max(A) where C == 0 and B != 0
Also, rows with D `True` and E `False` should be on top

C, D, E, A, B, and Max(A) where C == 0 and B == 0
Also, rows with D `True` and E `False` should be on top

Example category order:

 Category 1 -  C, D(True), E(False), B, A and C!=0
 Category 2 -  C, D(True), E(True), B, A and C!=0
 Category 3 -  C, D(False), E(False), B, A and C!=0
 Category 4 -  C, D(False), E(True), B, A and C!=0

Then consider C = 0:

 Category 5 -  B, D(True), E(False), B, A and B!=0 and C=0
 Category 6 -  B, D(True), E(True), B, A and B!=0 and C=0
 Category 7 -  B, D(False), E(False), B, A and B!=0 and C=0
 Category 8 -  B, D(False), E(True), B, A and B!=0 and C=0

And so on.

I tried adjusting the ascending argument, but did not get the expected output:

df.sort_values(['C', 'D','E', 'B', 'A'], ascending=[False, False, False, False, False]) # without category
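One way to see why this attempt falls short is to run it on the sample frame. The sketch below rebuilds the data from the table in the question and applies the same call; the name `attempt` is only illustrative. Because every key is sorted in descending order across the whole frame, rows are ordered by C first, so the D/E combinations never act as groups and the C == 0 rows are treated simply as the smallest C values.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'A': [8, 45, 35, 40, 12, 18, 25, 35, 95],
    'B': [5, 35, 10, 5, 10, 15, 15, 10, 50],
    'C': [0, 0, 1, 2, 5, 13, 5, 11, 0],
    'D': [False, True, False, True, False, False, True, False, False],
    'E': [True, False, True, False, False, True, False, True, False],
})

# Same call as in the question: every key descending, no category column.
attempt = df.sort_values(['C', 'D', 'E', 'B', 'A'],
                         ascending=[False, False, False, False, False])

# The row order is driven by C alone; D and E only break ties within equal C.
print(attempt.index.tolist())
```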

The expected output is below:

   A   B   C      D      E       Category
6  25  15   5   True  False       1    
3  40   5   2   True  False       1
4  12  10   5  False  False       3
5  18  15  13  False   True       4
7  35  10  11  False   True       4
2  35  10   1  False   True       4
1  45  35   0   True  False       5
8  95  50   0  False  False       6
0   8   5   0  False   True       7

1 answer:

Answer 0 (score: 1)

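Use numpy.select to build a new Category column from conditions on columns C, D, and E, then sort by the Category, A, and B columns in the default ascending order: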

import numpy as np

# True where C is not equal to 0
m0 = df['C'].ne(0)

# D is True and E is False
m1 = df['D'] & ~df['E']
# D is True and E is True
m2 = df['D'] & df['E']
# D is False and E is False
m3 = ~df['D'] & ~df['E']
# D is False and E is True
m4 = ~df['D'] & df['E']

# Categories 1-4 cover C != 0, categories 5-8 cover C == 0
df['Category'] = np.select([m0 & m1, m0 & m2, m0 & m3, m0 & m4,
                            ~m0 & m1, ~m0 & m2, ~m0 & m3, ~m0 & m4], [1,2,3,4,5,6,7,8])

# Sort by category first, then by A and B, all ascending
df = df.sort_values(['Category','A','B'])
print(df)
    A   B   C      D      E  Category
6  25  15   5   True  False         1
3  40   5   2   True  False         1
4  12  10   5  False  False         3
5  18  15  13  False   True         4
2  35  10   1  False   True         4
7  35  10  11  False   True         4
1  45  35   0   True  False         5
8  95  50   0  False  False         7
0   8   5   0  False   True         8
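The output above places row 2 before row 7, whereas the expected output in the question lists row 7 first. If the intent is for larger C values to come first within each category, which is an inference from the expected output and not something the answer states, one possible adjustment is to keep Category ascending and sort C, B, and A descending. A minimal sketch under that assumption (the name `result` is illustrative):

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'A': [8, 45, 35, 40, 12, 18, 25, 35, 95],
    'B': [5, 35, 10, 5, 10, 15, 15, 10, 50],
    'C': [0, 0, 1, 2, 5, 13, 5, 11, 0],
    'D': [False, True, False, True, False, False, True, False, False],
    'E': [True, False, True, False, False, True, False, True, False],
})

nonzero_c = df['C'].ne(0)
d, e = df['D'], df['E']

# Same eight categories as in the answer: 1-4 for C != 0, 5-8 for C == 0.
conditions = [nonzero_c & d & ~e, nonzero_c & d & e,
              nonzero_c & ~d & ~e, nonzero_c & ~d & e,
              ~nonzero_c & d & ~e, ~nonzero_c & d & e,
              ~nonzero_c & ~d & ~e, ~nonzero_c & ~d & e]
df['Category'] = np.select(conditions, [1, 2, 3, 4, 5, 6, 7, 8])

# Assumption: within a category, larger C (then B, then A) should come first.
result = df.sort_values(['Category', 'C', 'B', 'A'],
                        ascending=[True, False, False, False])
print(result)
```

With the sample data this yields the same row order as the expected output in the question. The Category labels of the last two rows still follow the Category 1 to 8 definitions (7 and 8), as in the answer's output.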