How to apply a custom sort order to MultiIndex columns

Time: 2019-08-19 15:17:35

Tags: python pandas

I have a DataFrame with MultiIndex columns, similar to the following:

import pandas as pd

my_frame = pd.DataFrame(data={'a':[1,2,3,4],'b':[5,6,7,8],'c':[9,10,11,12], 'd':[13,14,15,16],
                              'subcolumn_1':['A1','A1','A2','A2'],
                              'subcolumn_2':['B1','B2','B1','B2']})
my_frame.set_index(keys=['subcolumn_1','subcolumn_2'], inplace=True)
my_frame = my_frame.transpose()


subcolumn_1  A1      A2
subcolumn_2  B1  B2  B1  B2
a             1   2   3   4
b             5   6   7   8
c             9  10  11  12
d            13  14  15  16

I want to sort on subcolumn_2, not alphanumerically but according to a custom list, as in the pseudocode below.

my_frame.sort_subcolumn_2(neworder=["B2","B1"])

subcolumn_1  A1      A2
subcolumn_2  B2  B1  B2  B1
a             2   1   4   3
b             6   5   8   7
c            10   9  12  11
d            14  13  16  15

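Edit: my use case requires the sorting to happen after the new index has been set. The current solutions require the order to be applied before the index is set.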

2 answers:

Answer 0: (score: 2)

One way to achieve this is to convert subcolumn_2 into an ordered categorical.
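The ordered-categorical idea can be sketched as follows. This is a minimal illustration, not necessarily the answerer's original code: it assumes the frame from the question and rebuilds the column MultiIndex after the index has been set, so that `sort_index` follows the category order instead of alphabetical order. The names `level_1`, `level_2`, and `result` are introduced here for illustration only.

```python
import pandas as pd

my_frame = pd.DataFrame(data={'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8],
                              'c': [9, 10, 11, 12], 'd': [13, 14, 15, 16],
                              'subcolumn_1': ['A1', 'A1', 'A2', 'A2'],
                              'subcolumn_2': ['B1', 'B2', 'B1', 'B2']})
my_frame = my_frame.set_index(['subcolumn_1', 'subcolumn_2']).transpose()

neworder = ['B2', 'B1']

# Keep the outer level as it is.
level_1 = my_frame.columns.get_level_values('subcolumn_1')

# Turn the inner level into an ordered categorical whose category
# order is the custom order we want.
level_2 = pd.Categorical(my_frame.columns.get_level_values('subcolumn_2'),
                         categories=neworder, ordered=True)

# Rebuild the column MultiIndex and sort the columns; the inner level
# now sorts by category order rather than alphabetically.
my_frame.columns = pd.MultiIndex.from_arrays([level_1, level_2],
                                             names=my_frame.columns.names)
result = my_frame.sort_index(axis=1)
print(result)
```

Because the order lives in the categorical itself, any later `sort_index(axis=1)` call on the frame should keep respecting `neworder`.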


Answer 1: (score: 0)

I would probably use np.argsort to get the desired new row order, then apply it with .iloc before transposing.

Try this:

import pandas as pd
import numpy as np

my_frame = pd.DataFrame(data={'a': [1, 2, 3], 'b': [7, 8, 9], 'c': [4, 5, 6],
                              'subcolumn_1': ['A1', 'A2', 'A3'],
                              'subcolumn_2': ['B1', 'B2', 'B3']})

neworder = ["B2", "B1", "B3"]
print(my_frame.iloc[np.argsort(neworder)].set_index(keys=['subcolumn_1', 'subcolumn_2']).transpose())

Output:

subcolumn_1 A2 A1 A3
subcolumn_2 B2 B1 B3
a            2  1  3
b            8  7  9
c            5  4  6
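One caveat about the snippet above: `np.argsort(neworder)` returns the positions that would sort `neworder` alphabetically. That coincides with the desired row order for `["B2", "B1", "B3"]` because the list is a simple swap, but it does not hold for an arbitrary order. A hedged variant, assuming the values in subcolumn_2 are unique, looks up the position of each label directly. The name `positions` and the alternative `neworder` are chosen here for illustration.

```python
import pandas as pd

my_frame = pd.DataFrame(data={'a': [1, 2, 3], 'b': [7, 8, 9], 'c': [4, 5, 6],
                              'subcolumn_1': ['A1', 'A2', 'A3'],
                              'subcolumn_2': ['B1', 'B2', 'B3']})

# An order for which np.argsort(neworder) would give the wrong rows.
neworder = ["B3", "B1", "B2"]

# Find the row position of each label in neworder
# (assumes the labels in subcolumn_2 are unique).
positions = pd.Index(my_frame['subcolumn_2']).get_indexer(neworder)

result = (my_frame.iloc[positions]
          .set_index(keys=['subcolumn_1', 'subcolumn_2'])
          .transpose())
print(result)
```

Like the original answer, this still reorders the rows before the index is set.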