Question

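I have Googled and searched Stack Overflow, but I can't find an answer to this simple question:

Suppose I have a pandas DataFrame with a MultiIndex, like this: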

Foo  0    0.021362
     1    0.917947
     2   -0.956313
     3    0.834556
     4   -0.387533
Bar  0   -0.242659
     1    0.398657
     2    0.455909
     3    0.200061
     4   -1.273537
Baz  0    0.747849
     1   -0.012899
     2    1.026659
     3   -0.256648
     4    0.799381

How can I limit the output to the first N rows of each secondary index, as shown below (if N is 2):

Foo  0    0.021362
     1    0.917947
Bar  0   -0.242659
     1    0.398657
Baz  0    0.747849
     1   -0.012899

So far, all my attempts with iloc, loc, slice, sliceindex, and ix have failed. Thanks for any help, and apologies if this has already been posted.

Answer 1

Call groupby with level=0 (to group on the first index level), then call head(2) to get the first two rows of each group:

In [13]:
df.groupby(level=0).head(2)

Out[13]:
                    val
index1 index2          
Foo    0       0.021362
       1       0.917947
Bar    0      -0.242659
       1       0.398657
Baz    0       0.747849
       1      -0.012899
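
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the question and wraps the call in a small helper. The index names index1 and index2 and the column name val are taken from the output above; the helper name first_n_per_group is hypothetical and only for illustration.

```python
import pandas as pd

# Values copied from the question; index and column names follow the output above.
values = [0.021362, 0.917947, -0.956313, 0.834556, -0.387533,
          -0.242659, 0.398657, 0.455909, 0.200061, -1.273537,
          0.747849, -0.012899, 1.026659, -0.256648, 0.799381]
index = pd.MultiIndex.from_product(
    [["Foo", "Bar", "Baz"], range(5)], names=["index1", "index2"]
)
df = pd.DataFrame({"val": values}, index=index)


def first_n_per_group(frame, n):
    """Return the first n rows of each first-level group (hypothetical helper)."""
    # head(n) on a groupby keeps the rows in their original order.
    return frame.groupby(level=0).head(n)


top2 = first_n_per_group(df, 2)
print(top2)
```

Note that head(n) selects by position within each group, so it does not depend on what the second-level labels are, and it does not require a sorted index.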

You can also slice using loc, but the index must be sorted first:

In [25]:
idx = pd.IndexSlice
df.sort_index().loc[idx[:,0:1],:]

Out[25]:
                    val
index1 index2          
Bar    0      -0.242659
       1       0.398657
Baz    0       0.747849
       1      -0.012899
Foo    0       0.021362
       1       0.917947
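
The same slice can be written for a general N. The sketch below assumes the second-level labels are the integers 0, 1, 2, ... as in the question; because loc slices by label and includes both endpoints, the upper bound is n - 1 rather than n. The variable names are illustrative only.

```python
import pandas as pd

# Values copied from the question; index and column names follow the output above.
values = [0.021362, 0.917947, -0.956313, 0.834556, -0.387533,
          -0.242659, 0.398657, 0.455909, 0.200061, -1.273537,
          0.747849, -0.012899, 1.026659, -0.256648, 0.799381]
index = pd.MultiIndex.from_product(
    [["Foo", "Bar", "Baz"], range(5)], names=["index1", "index2"]
)
df = pd.DataFrame({"val": values}, index=index)

n = 2
idx = pd.IndexSlice

# Label-based slices include both endpoints, so 0:n-1 keeps labels 0 .. n-1.
# Assumption: the second level is labelled 0, 1, 2, ... within every group.
sliced = df.sort_index().loc[idx[:, 0:n - 1], :]
print(sliced)
```

Unlike groupby(level=0).head(n), this selects by label rather than by position, and the result comes back in sorted order (Bar, Baz, Foo) because of sort_index.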

If you don't call sort_index, a KeyError is raised:

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)'
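
A minimal sketch of that failure and the fix, assuming the same index layout as the question (the values here are placeholders, since only the index matters). In recent pandas versions the exception is, as far as I know, reported as UnsortedIndexError, which is a subclass of KeyError, so catching KeyError covers both; the exact message wording may differ from the one quoted above.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Foo", "Bar", "Baz"], range(5)], names=["index1", "index2"]
)
# Placeholder values: only the index order matters for this demonstration.
df = pd.DataFrame({"val": range(15)}, index=index)
idx = pd.IndexSlice

raised = False
try:
    # The first level is in Foo, Bar, Baz order, so the index is not lexsorted.
    df.loc[idx[:, 0:1], :]
except KeyError as exc:
    raised = True
    print("slicing failed:", exc)

# Sorting the index first makes the same slice valid.
sorted_df = df.sort_index()
result = sorted_df.loc[idx[:, 0:1], :]
print(result)
```

Checking df.index.is_monotonic_increasing before slicing is one way to tell whether the sort is needed.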

pandas: return the first N rows of each secondary index of a DataFrame