Pandas won't allow selecting a column when a MultiIndex is present?

Time: 2019-11-20 13:47:32

Tags: python pandas

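I'm debugging some pandas code that accidentally creates a MultiIndex instead of a regular index. Because of the MultiIndex, pandas won't let me select a column. In this case I can get rid of the MultiIndex, but if I really did need it, how would I select a column? Additional info: I get this error in pandas 0.25.1, but the code comes from a notebook someone wrote a few years ago, so apparently it used to work in older versions?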

import numpy as np
import pandas as pd

names = ['FirstColumn', 'SecondColumn']
data = np.array([[5,6],[7,8]])
df = pd.DataFrame(data, columns = [names]) #Bug: this "works" but isn't what you want.
#The brackets around "[names]" create a MultiIndex, which was unintentional.
#But "df.head()" and "df.describe()" both look normal, so you can't see that anything is wrong.

df['FirstColumn'] #ERROR! works fine with a single index, but fails with multiindex
df.FirstColumn #ERROR! works fine with a single index, but fails with multiindex
df.loc[:,'FirstColumn'] #ERROR! works fine with a single index, but fails with multiindex

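These statements all fail with the misleading error "only integer scalar arrays can be converted to a scalar index". So how do you select a column when there is a MultiIndex? I know a few tricks, such as unstacking or changing the index, but it seems like there should be a simple way?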

Update: it turns out this works fine in pandas 0.22.0 but fails in 0.25.1. It looks like a regression was introduced. I've reported it on the pandas GitHub.
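
For anyone hitting the same thing, here is a minimal sketch of how the accidental MultiIndex could be detected and flattened. It assumes the extra level was unintentional, as in the question; the name `flat` is only illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

names = ['FirstColumn', 'SecondColumn']
data = np.array([[5, 6], [7, 8]])

# Same construction as in the question: the extra brackets produce a MultiIndex.
df = pd.DataFrame(data, columns=[names])
print(isinstance(df.columns, pd.MultiIndex))  # check the type of the column index
print(df.columns.nlevels)                     # number of levels in the column index

# Flatten by taking the labels of the only level.
# Assumption: the MultiIndex was unintentional, so nothing is lost by dropping it.
flat = df.copy()
flat.columns = df.columns.get_level_values(0)
print(flat['FirstColumn'])
```

Checking `df.columns` directly is useful here because, as noted in the question, `df.head()` and `df.describe()` do not make the problem visible.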

1 Answer:

Answer 0: (score: 0)

Use the DataFrame.xs function:

print (df.xs('FirstColumn', axis=1, level=0))
  FirstColumn
0           5
1           7
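
For the case where the MultiIndex is really wanted, the following sketch shows the usual ways to select columns from a genuine two-level column index. The frame `wide` and its labels ('A', 'B', 'x', 'y', 'outer', 'inner') are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level column labels, invented for this example.
columns = pd.MultiIndex.from_tuples(
    [('A', 'x'), ('A', 'y'), ('B', 'x')], names=['outer', 'inner']
)
wide = pd.DataFrame(np.arange(6).reshape(2, 3), columns=columns)

# A full tuple key selects exactly one column and returns a Series.
one_column = wide[('A', 'x')]

# An outer-level label selects every column under it and returns a DataFrame.
group_a = wide['A']

# xs takes a cross-section on any level; here it picks 'x' from the inner level.
inner_x = wide.xs('x', axis=1, level='inner')

print(one_column)
print(group_a)
print(inner_x)
```

With a tuple key the result is a single column, while a partial key or `xs` returns a DataFrame with the selected level dropped.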