Question

How can I sort a MultiIndex DataFrame while respecting the organization of its levels?

For example, given the following df, say we want to sort it by C (in descending order, for instance):

                   C         D  E
A    B                           
bar  one   -0.346528  1.528538  1
     three -0.136710 -0.147842  1
flux six    0.795641 -1.610137  1
     three  1.051926 -1.316725  2
foo  five   0.906627  0.717922  0
     one   -0.152901 -0.043107  2
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1

We should get:

                   C         D  E
A    B                           
bar  three -0.136710 -0.147842  1
     one   -0.346528  1.528538  1
flux three  1.051926 -1.316725  2
     six    0.795641 -1.610137  1
foo  five   0.906627  0.717922  0
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1
     one   -0.152901 -0.043107  2

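Note that by "respecting its index structure" I mean sorting the leaves of the DataFrame without changing the order of the higher-level index. In other words, I want to sort within the second level while keeping the order of the first level unchanged.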

How can I do the same in ascending order?

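I have read these two threads (yes, both have the same title):

But they sort the DataFrame by different criteria (e.g. by index names, or by a specific column within a group).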

Answer 1

Use .reset_index, then sort by columns A and C, then set the index again; this is more efficient than the earlier groupby solution:

>>> df.reset_index().sort_values(by=['A', 'C'], ascending=[True, False]).set_index(['A', 'B'])
                C      D  E
A    B                     
bar  three -0.137 -0.148  1
     one   -0.347  1.529  1
flux three  1.052 -1.317  2
     six    0.796 -1.610  1
foo  five   0.907  0.718  0
     two    0.542 -0.373  2
     two    0.330  1.068  1
     one   -0.153 -0.043  2
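
A self-contained sketch of this approach, for anyone who wants to run it end to end. The frame is rebuilt by hand from the values in the question, and `sort_within_first_level` is a hypothetical helper name introduced only for illustration. It uses `sort_values`, which is the spelling in current pandas releases, and passing `ascending=True` should cover the ascending case the question also asks about.

```python
import pandas as pd

# Rebuild the question's frame by hand (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "three"), ("flux", "six"), ("flux", "three"),
     ("foo", "five"), ("foo", "one"), ("foo", "two"), ("foo", "two")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": [-0.346528, -0.136710, 0.795641, 1.051926,
              0.906627, -0.152901, 0.542137, 0.329831],
        "D": [1.528538, -0.147842, -1.610137, -1.316725,
              0.717922, -0.043107, -0.373016, 1.067820],
        "E": [1, 1, 1, 2, 0, 2, 2, 1],
    },
    index=index,
)


def sort_within_first_level(frame, by="C", ascending=False):
    """Hypothetical helper: sort rows by `by` inside each first-level value."""
    levels = list(frame.index.names)
    return (
        frame.reset_index()
        # The first level is sorted ascending; `by` follows the requested order.
        .sort_values(by=[levels[0], by], ascending=[True, ascending])
        .set_index(levels)
    )


desc = sort_within_first_level(df, ascending=False)
asc = sort_within_first_level(df, ascending=True)
print(desc)
print(asc)
```

Note that this sorts the first level itself in ascending order. That matches the question's frame because A is already sorted there; it would reorder a first level that is not.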

Earlier solution: .groupby(...).apply is relatively slow and may not scale well:

>>> df['arg-sort'] = df.groupby(level='A')['C'].apply(pd.Series.argsort)
>>> f = lambda obj: obj.iloc[obj.loc[::-1, 'arg-sort'], :]
>>> df.groupby(level='A', group_keys=False).apply(f)
                C      D  E  arg-sort
A    B                               
bar  three -0.137 -0.148  1         1
     one   -0.347  1.529  1         0
flux three  1.052 -1.317  2         1
     six    0.796 -1.610  1         0
foo  five   0.907  0.718  0         1
     two    0.542 -0.373  2         2
     two    0.330  1.068  1         0
     one   -0.153 -0.043  2         3
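
If the first level is not in sorted order to begin with, a variation on the groupby idea may be worth trying. The following is only a sketch under assumptions, not part of the original answer: the small frame is made up for illustration (its values are borrowed from the question, with the A groups deliberately out of alphabetical order), and it relies on `sort=False` keeping groups in order of first appearance. It is still an `apply`, so the performance caveat above applies.

```python
import pandas as pd

# Illustrative frame: first level appears as foo, then bar (not alphabetical).
index = pd.MultiIndex.from_tuples(
    [("foo", "five"), ("foo", "one"), ("bar", "one"), ("bar", "three")],
    names=["A", "B"],
)
df = pd.DataFrame({"C": [0.906627, -0.152901, -0.346528, -0.136710]}, index=index)

# Sort each A group by C (descending) without reordering the groups themselves.
result = (
    df.groupby(level="A", sort=False, group_keys=False)
    .apply(lambda g: g.sort_values("C", ascending=False))
)
print(result)
```

Switching to `ascending=True` inside the lambda would give the ascending variant.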

Sorting a MultiIndex while respecting its index structure