Question

I have a pandas table, for example:

Entries Col1    Col2    Col3    Col4
Entry1  -1.46   93.93   3.33    92.51   
Entry2  -48.59  31.49   -22.97  80.25
Entry3  8.24    95.85   -5.05   90.29

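I want to sort the entries based on all 4 columns. In Col1 and Col3, the values closest to 0 should rank best; in Col2 and Col4, the highest values should rank best.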

At the moment I have something like this:

data.sort_values(cols, ascending=[False,True,False,True],inplace=True)

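But all this does is sort by the first column; the other columns have no effect on the order. I need the entries to be sorted by all columns. If Entry1 is the best only in Col1 and Entry2 is the best in Col2, then Entry3 should be ranked first.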

Expected output:

Entries Col1    Col2    Col3    Col4
Entry1  -1.46   93.93   3.33    92.51   
Entry3  8.24    95.85   -5.05   90.29
Entry2  -48.59  31.49   -22.97  80.25

Entry1 is the best in Col1, Col3 and Col4. Entry2 is worse in all columns. Entry3 is first in Col2 and second in the others.

Thanks.

Answer 1

Start by creating two helper columns, Col1a and Col3a, holding the absolute values of the respective source columns:

data['Col1a'] = data.Col1.abs()
data['Col3a'] = data.Col3.abs()

Then sort your DataFrame:

data.sort_values(['Col1a', 'Col2', 'Col3a', 'Col4'],
    ascending=[True, False, True, False], inplace=True)

Note that the ascending parameter differs from the one in your code.

Finally, drop the helper columns:

data.drop(columns=['Col1a', 'Col3a'], inplace=True)
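The steps in this answer can be put together into one self-contained sketch. It assumes the sample data from the question; the function name `sort_entries` is hypothetical, and it returns a new DataFrame instead of modifying `data` in place:

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame(
    {
        "Entries": ["Entry1", "Entry2", "Entry3"],
        "Col1": [-1.46, -48.59, 8.24],
        "Col2": [93.93, 31.49, 95.85],
        "Col3": [3.33, -22.97, -5.05],
        "Col4": [92.51, 80.25, 90.29],
    }
)


def sort_entries(df):
    """Hypothetical helper: |Col1| asc, Col2 desc, |Col3| asc, Col4 desc."""
    # Helper columns with absolute values, so "closest to 0" sorts first.
    helpers = df.assign(Col1a=df["Col1"].abs(), Col3a=df["Col3"].abs())
    ordered = helpers.sort_values(
        ["Col1a", "Col2", "Col3a", "Col4"],
        ascending=[True, False, True, False],
    )
    # Remove the helper columns again before returning.
    return ordered.drop(columns=["Col1a", "Col3a"])


result = sort_entries(data)
print(result["Entries"].tolist())  # ['Entry1', 'Entry3', 'Entry2']
```

Keep in mind that `sort_values` with several columns sorts lexicographically: Col2, Col3a and Col4 only break ties among rows whose Col1a values are equal. For the sample data this happens to give the expected order.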

How to sort/rank a pandas DataFrame based on multiple columns
