Question

Suppose I have a pandas DataFrame with three index levels 'a', 'b' and 'c'. How can I add a fourth index level from an array and set its name to 'd' at the same time?

This works:

df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(['a','b','c','d'], inplace=True)

However, I'm looking for something that doesn't require me to name the first three index levels again, for example (this does not work):

df.set_index({'d': fourth_index}, append=True, inplace=True)

Am I missing some functionality here?

Answer 1

Add fourth_index as a column, then call set_index. The existing level names are preserved.

df = df.assign(d=fourth_index).set_index('d', append=True)

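Note that if you are worried about memory, what you are already doing is fine. There is no need to sacrifice performance just to type fewer characters.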

Demo

df
          a   b   c   d
l1  l2                 
bar one  24  13   8   9
    two  11  30   7  23
baz one  21  31  12  30
    two   2   5  19  24
foo one  15  18   3  16
    two   2  24  28  11
qux one  23   9   6  12
    two  29  28  11  21

df.assign(l3=1).set_index('l3', append=True)

             a   b   c   d
l1  l2  l3                
bar one 1   24  13   8   9
    two 1   11  30   7  23
baz one 1   21  31  12  30
    two 1    2   5  19  24
foo one 1   15  18   3  16
    two 1    2  24  28  11
qux one 1   23   9   6  12
    two 1   29  28  11  21
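
The demo above appends a constant. Here is a minimal, self-contained sketch of the same approach with an actual array and the three levels from the question. The sample frame, the `value` column and the contents of `fourth_index` are made up for illustration; the only requirement is that the array has one entry per row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a frame indexed by three levels 'a', 'b', 'c'.
idx = pd.MultiIndex.from_product(
    [[0, 1], ["x", "y"], [True, False]], names=["a", "b", "c"]
)
df = pd.DataFrame({"value": np.arange(8)}, index=idx)

# Hypothetical array for the new level; its length must equal len(df).
fourth_index = np.arange(100, 108)

# assign() adds the array as a temporary column named 'd';
# set_index(..., append=True) then moves that column into the index,
# keeping the existing level names untouched.
result = df.assign(d=fourth_index).set_index("d", append=True)

print(list(result.index.names))
```

Since assign returns a new frame, the original df keeps its three-level index; that copy is the memory cost mentioned in the note above.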

Answer 2

Why not save the previous names first, i.e.

old_names = df.index.names
df.set_index(fourth_index, append=True, inplace=True)
df.index.set_names(old_names + ['d'], inplace=True)

This keeps the performance benefit and doesn't require you to retype the old names.
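
If you do this often, the three lines above can be wrapped in a small helper. This is only a sketch: `append_level` is a hypothetical name, not a pandas function, and the sample frame is made up. It returns a new frame rather than using inplace=True, which is a design choice on my part rather than something the snippet above requires.

```python
import numpy as np
import pandas as pd


def append_level(df, values, name):
    """Hypothetical helper: append `values` as a new index level called `name`."""
    # Remember the current level names before the index changes.
    old_names = list(df.index.names)
    # Appending a bare array creates an unnamed (None) level.
    out = df.set_index(values, append=True)
    # Reapply the saved names and add the new one at the end.
    out.index = out.index.set_names(old_names + [name])
    return out


# Hypothetical sample data with index levels 'a', 'b', 'c'.
idx = pd.MultiIndex.from_product(
    [[0, 1], ["x", "y"], [True, False]], names=["a", "b", "c"]
)
df = pd.DataFrame({"value": np.arange(8)}, index=idx)
fourth_index = np.arange(100, 108)

result = append_level(df, fourth_index, "d")
print(list(result.index.names))
```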

Append a level to a pandas MultiIndex
