Question

How do I select specific columns from a DataFrame when the columns have multiple named levels?

>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame({'instance':['first','first','first'],'foo':['a','b','c'],'bar':np.random.rand(3)})
>>> x = x.set_index(['instance','foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first                    
foo              a         b         c
bar       0.102885  0.937838  0.907467

(Note: this came up in the comments on this SO question, where it was also answered. I thought it would be worth posting as a question in its own right.)

Answer 1

This is exactly what MultiIndex slicers are for; see the documentation here.

In [15]: idx = pd.IndexSlice

In [16]: x.loc[:,idx[:,'a']]
Out[16]: 
instance     first
foo              a
bar       0.525356

In [17]: x.loc[:,idx[:,['a','c']]]
Out[17]: 
instance     first          
foo              a         c
bar       0.525356  0.418152
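
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the slicer approach above. The frame is rebuilt the same way as in the question, so the random values will differ from the output shown. The `xs` call at the end is not part of the original answer; it is an alternative I am adding for comparison, and the variable names `sliced` and `cross` are just illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
df = pd.DataFrame({
    'instance': ['first', 'first', 'first'],
    'foo': ['a', 'b', 'c'],
    'bar': np.random.rand(3),
})
x = df.set_index(['instance', 'foo']).transpose()

idx = pd.IndexSlice

# Slicer: any value on level 'instance', only 'a' and 'c' on level 'foo'.
sliced = x.loc[:, idx[:, ['a', 'c']]]
print(sliced)

# Alternative (an addition, not from the answer): a cross-section on the
# named level; drop_level=False keeps both column levels in the result.
cross = x.xs('a', level='foo', axis=1, drop_level=False)
print(cross)
```

The slicer form is the more general of the two, since it accepts lists and slices on each level, while `xs` is convenient when you want a single key and would rather refer to the level by name.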
