Question

I have a dataframe with multiple index levels, for example:

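import numpy as np
import pandas as pd

# Note the comma between 'three' and 'four'; without it the two strings are concatenated.
idx = pd.MultiIndex.from_product((['foo', 'bar'], ['one', 'five', 'three', 'four']),
                                 names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]}, index=idx).dropna().astype(int)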

              A     
first second
foo   five     12
      four     11
bar   one      16
      five     12
      three    11

I want to create a new column from the index level named second, so that I get

              A    B  
first second
foo   five     12   five
      four     11   four
bar   one      16   one
      five     12   five
      three    11   three

I can do this by resetting the index, copying the column, and then re-applying the index, but that seems rather roundabout.
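
For reference, the roundabout approach described above might look like the following minimal sketch. The names `tmp` and `result` are illustrative and not part of the original post, and the setup assumes a comma between 'three' and 'four' in the index definition.

```python
import numpy as np
import pandas as pd

# Setup as in the question (comma between 'three' and 'four' assumed).
idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),
    names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]},
                  index=idx).dropna().astype(int)

# 1. Move both index levels into ordinary columns.
tmp = df.reset_index()
# 2. Copy the 'second' column into a new column 'B'.
tmp['B'] = tmp['second']
# 3. Restore the original two-level index.
result = tmp.set_index(['first', 'second'])

print(result)
```

This works, but it takes three steps and an intermediate frame to do what a single index lookup can do.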

I tried df.index.levels[1], but that returns a sorted list of the unique labels, so it does not preserve the row order.

If it were a single-level index I would just use df.index, but with a MultiIndex that creates a column of tuples.
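
The two observations above can be checked with a small sketch. The variable names `levels` and `rows` are illustrative only, and the setup again assumes a comma between 'three' and 'four'.

```python
import numpy as np
import pandas as pd

# Setup as in the question (comma between 'three' and 'four' assumed).
idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),
    names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]},
                  index=idx).dropna().astype(int)

# levels[1] holds each distinct label of the second level once,
# not one label per row, so it cannot be used as a column directly.
levels = df.index.levels[1]
print(len(levels), len(df))

# Iterating over the MultiIndex itself yields one tuple per row.
rows = list(df.index)
print(rows[0])
```

The level object has four entries while the frame has five rows, which is why it cannot simply be assigned as a column.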

If this has been answered elsewhere, please share the link; I have had no luck searching the Stack Overflow archives.

Answer 1

df['B'] = df.index.get_level_values(level=1)  # Zero based indexing.
# df['B'] = df.index.get_level_values(level='second')  # This also works.
>>> df
               A      B
first second           
foo   five    12   five
      four    11   four
bar   one     16    one
      five    12   five
      three   11  three

Answer 2

df['B'] = idx.to_series().str[1]
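
A minimal sketch of what this one-liner does, assuming the frame from the question (with a comma between 'three' and 'four'). It uses `df.index` rather than `idx`, so the result does not depend on index alignment dropping the rows removed by `dropna()`; the name `tuples` is illustrative.

```python
import numpy as np
import pandas as pd

# Setup as in the question (comma between 'three' and 'four' assumed).
idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),
    names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]},
                  index=idx).dropna().astype(int)

# to_series() gives a Series whose values are the index tuples,
# e.g. ('foo', 'five').
tuples = df.index.to_series()

# .str[1] picks element 1 of each tuple, i.e. the 'second' level label.
df['B'] = tuples.str[1]

print(df)
```

The `.str` accessor is being used here for element-wise indexing of tuples rather than for string handling, so `get_level_values` is arguably the more readable choice.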

Answer 3

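If you want to get the index level values by index name (rather than by numeric position), you can borrow from @AlbertoGarcia-Raboso's answer.

Note that this gives you an output that still carries the index levels; it is a Series, which is what the question asks for. At first glance it looks like a duplicated column.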

df.index.to_frame()['second']

(You can then ask for, say, the 9th item of the Series with df.index.to_frame()['second'][8].)
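
A short sketch of this approach applied to the five-row frame from the question (setup assumes a comma between 'three' and 'four'). Since that frame has only five rows, the example reads the third item instead, and it uses `.iloc` for explicit positional access, which is a substitution made here rather than something taken from the answer. The name `second` is illustrative.

```python
import numpy as np
import pandas as pd

# Setup as in the question (comma between 'three' and 'four' assumed).
idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),
    names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]},
                  index=idx).dropna().astype(int)

# to_frame() turns each index level into a column; the result keeps
# the original MultiIndex as its own index.
second = df.index.to_frame()['second']

# Positional access to the third item of the Series.
third_item = second.iloc[2]
print(third_item)

# Because the Series shares df's index, it can be assigned directly.
df['B'] = second
print(df)
```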

Pandas: Get MultiIndex level as Series
