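I have a DataFrame whose columns are a MultiIndex.
I know that when I want to select a single column name on a given level, I can use .xs(), as in the code below.
df.xs('column_name1', level='column_level1', axis=1)
In my case, I want to select several column names, as in the code below. (It does not actually work, because .xs does not support a list here.)
df.xs(['column_name1', 'column_name2'], level='column_level1', axis=1)
How can I select multiple column names on one specific level?
Here is more concrete code.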
import pandas as pd
import io
data = u"""
column_name1,column_name2,column_name3
column_nameA,column_nameB,column_nameC
0.1,1,10
0.2,2,20
0.3,3,30
"""
df = pd.read_csv(io.StringIO(data), header=[0, 1])
df.columns.names = ['column_level1', 'column_level2']
print(df)
df looks like this:
column_level1 column_name1 column_name2 column_name3
column_level2 column_nameA column_nameB column_nameC
0                      0.1            1           10
1                      0.2            2           20
2                      0.3            3           30
And I want to produce the following data by selecting on column names:
column_level1 column_name1 column_name2
column_level2 column_nameA column_nameB
0                      0.1            1
1                      0.2            2
2                      0.3            3
Answer 0 (score: 0)
IIUC, you can use loc with slice, as described in the docs:
In [58]: df
Out[58]:
first        bar                 baz                 foo                 qux
second       one       two       one       two       one       two       one       two
0      -0.313815 -0.160567 -0.028432 -1.169930  1.043274  0.353722 -0.912303 -1.041827
1      -0.317570 -0.452766  0.950578  0.467092 -1.960936  1.700110  0.003934  0.989709
2       0.091249  2.406773  1.848771 -1.275288  0.740245  0.657444 -1.157392 -0.103663
In [59]: df.loc[:, (['bar', 'baz'], slice(None))]
Out[59]:
first        bar                 baz
second       one       two       one       two
0      -0.313815 -0.160567 -0.028432 -1.169930
1      -0.317570 -0.452766  0.950578  0.467092
2       0.091249  2.406773  1.848771 -1.275288
For the second level:
In [68]: df.loc[:, (slice(None), ['one', 'two'])]
Out[68]:
first        bar                 baz                 foo                 qux
second       one       two       one       two       one       two       one       two
0      -0.313815 -0.160567 -0.028432 -1.169930  1.043274  0.353722 -0.912303 -1.041827
1      -0.317570 -0.452766  0.950578  0.467092 -1.960936  1.700110  0.003934  0.989709
2       0.091249  2.406773  1.848771 -1.275288  0.740245  0.657444 -1.157392 -0.103663
EDIT:
For your DataFrame:
In [75]: df.loc[:, (slice(None), ['column_nameA', 'column_nameB'])]
Out[75]:
column_level1 column_name1 column_name2
column_level2 column_nameA column_nameB
0                      0.1            1
1                      0.2            2
2                      0.3            3
In [77]: df.loc[:, (['column_name1', 'column_name2'], slice(None))]
Out[77]:
column_level1 column_name1 column_name2
column_level2 column_nameA column_nameB
0                      0.1            1
1                      0.2            2
2                      0.3            3
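As a side note, the same selection can also be written with pd.IndexSlice, which some people find easier to read than nested slice(None) tuples. A minimal sketch, assuming the sample data from the question (the names idx and subset are only illustrative):

```python
import io

import pandas as pd

# Sample data copied from the question: two header rows form the column MultiIndex.
data = """column_name1,column_name2,column_name3
column_nameA,column_nameB,column_nameC
0.1,1,10
0.2,2,20
0.3,3,30
"""
df = pd.read_csv(io.StringIO(data), header=[0, 1])
df.columns.names = ['column_level1', 'column_level2']

# idx[first_level_labels, second_level_labels]; a bare ":" keeps everything on that level.
idx = pd.IndexSlice
subset = df.loc[:, idx[['column_name1', 'column_name2'], :]]
print(subset)
```

Here idx[[...], :] builds the same tuple as (['column_name1', 'column_name2'], slice(None)), so it is only a notational convenience.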
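Answer 1 (score: 0)
You can try select (note that DataFrame.select was deprecated in pandas 0.21 and removed in 1.0, so this only works on older versions):
print(df.select(lambda x: x[0] in ['column_name1','column_name2'], axis=1))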
column_level1 column_name1 column_name2
column_level2 column_nameA column_nameB
0                      0.1            1
1                      0.2            2
2                      0.3            3
Alternatively, build a boolean mask with get_level_values and isin:
print(df.loc[:, df.columns.get_level_values('column_level1')
             .isin(['column_name1','column_name2'])])
column_level1 column_name1 column_name2
column_level2 column_nameA column_nameB
0                      0.1            1
1                      0.2            2
2                      0.3            3
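On pandas versions where select is not available, the predicate approach can be approximated by evaluating the condition over the column tuples yourself. The sketch below also wraps the isin mask in a small helper; select_columns_by_level is a hypothetical name made up for this example, not a pandas function, and the data is assumed to be the sample from the question.

```python
import io

import pandas as pd

# Sample data copied from the question.
data = """column_name1,column_name2,column_name3
column_nameA,column_nameB,column_nameC
0.1,1,10
0.2,2,20
0.3,3,30
"""
df = pd.read_csv(io.StringIO(data), header=[0, 1])
df.columns.names = ['column_level1', 'column_level2']

wanted = ['column_name1', 'column_name2']

# Each column label is a tuple (level1, level2); test the first element,
# which mirrors the lambda x: x[0] in [...] predicate.
by_predicate = df.loc[:, [col[0] in wanted for col in df.columns]]


def select_columns_by_level(frame, level, names):
    """Hypothetical helper: keep columns whose label on `level` is in `names`."""
    mask = frame.columns.get_level_values(level).isin(names)
    return frame.loc[:, mask]


by_helper = select_columns_by_level(df, 'column_level1', wanted)
print(by_helper)
```

The helper accepts either a level name or a level position, since get_level_values takes both.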