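After grouping, my new df has a 3-level MultiIndex. I need to access all rows labelled 'ZEBRA', which sits in the third index level. I tried df.loc but could not get it to work. I thought about iterating over the labels, but that would need nested loops to walk the structure below, which makes me think I am approaching this the wrong way and that there must be a much easier method.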
> indexlevel1_value1->indexlevel2_value1>indexlevel3_'stabilizer'
> indexlevel1_value1->indexlevel2_value2>indexlevel3_'stabilizer'
> indexlevel1_value1->indexlevel2_value3>indexlevel3_'stabilizer'
> ...................
> indexlevel2_value1->indexlevel2_value1>indexlevel3_'stabilizer'
This question looks close: "Selecting rows in a MultiIndex dataframe by index without losing any levels", but it focuses on the first index level.
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo',
                         'bar', 'foo', 'bar', 'foo',
                         'bar', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three',
                         'two', 'three', 'two', 'two',
                         'one', 'three'],
                   'C': ['MR', 'ZEBRA', 'KID', 'ZEBRA',
                         'MOS', 'ALPHA', 'ZULU', 'ZEBRA',
                         'TREE', 'PLANT', 'JOOMLA', 'ZEBRA',
                         'MOS', 'ZULU'],
                   'D': np.random.randn(14)})

grouped = df.groupby(['A', 'B', 'C'])
grouped.count()
| A | B | C | D |
|-----|-------|--------|---|
| bar | one | MOS | 1 |
| | | ZEBRA | 1 |
| | three | ZEBRA | 1 |
| | two | ALPHA | 1 |
| | | JOOMLA | 1 |
| | | TREE | 1 |
| foo | one | MR | 1 |
| | | ZULU | 1 |
| | three | PLANT | 1 |
| | | ZEBRA | 1 |
| | | ZULU | 1 |
| | two | KID | 1 |
| | | MOS | 1 |
| | | ZEBRA | 1 |
newdf = grouped.count()
newdf.loc[('bar', 'three', 'ZEBRA')]
# 1
Desired output:
| A | B | C | D |
|-----|-------|-------|---|
| bar | one | ZEBRA | 1 |
| bar | three | ZEBRA | 1 |
| foo | three | ZEBRA | 1 |
| foo | two | ZEBRA | 1 |
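Answer 0 (score: 1)
You can do it like this (note that the mask has to be applied to the counted DataFrame, newdf in the question, because the GroupBy object itself has no index):
newdf[newdf.index.get_level_values(2) == 'ZEBRA'].reset_index()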
     A      B      C  D
0  bar    one  ZEBRA  1
1  bar  three  ZEBRA  1
2  foo  three  ZEBRA  1
3  foo    two  ZEBRA  1
Alternative: newdf.query("C == 'ZEBRA'").reset_index()
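For reference, here is a minimal self-contained sketch of both approaches from this answer, rebuilt on the question's data. The variable names `mask`, `by_mask`, and `by_query` are illustrative and not part of the original answer; the level is addressed by its name 'C' rather than by position 2, which is an equivalent choice here.

```python
import numpy as np
import pandas as pd

# Same data as in the question; the values in D are random but only counted.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo',
          'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one',
          'three', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': ['MR', 'ZEBRA', 'KID', 'ZEBRA', 'MOS', 'ALPHA', 'ZULU',
          'ZEBRA', 'TREE', 'PLANT', 'JOOMLA', 'ZEBRA', 'MOS', 'ZULU'],
    'D': np.random.randn(14),
})

# Counting per group yields a DataFrame indexed by a 3-level MultiIndex (A, B, C).
newdf = df.groupby(['A', 'B', 'C']).count()

# Approach 1: boolean mask built from the values of the third index level.
mask = newdf.index.get_level_values('C') == 'ZEBRA'
by_mask = newdf[mask].reset_index()

# Approach 2: query() can refer to named index levels as if they were columns.
by_query = newdf.query("C == 'ZEBRA'").reset_index()

print(by_mask)
print(by_query)
```

Both variables should hold the same four rows; `reset_index()` only turns the index levels back into ordinary columns and can be dropped if the MultiIndex should be kept.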
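Answer 1 (score: 1)
I like to use the axis parameter with .loc (df_out below is presumably the result of grouped.count(), i.e. newdf in the question):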
df_out.loc(axis=0)[:, :, 'ZEBRA'].reset_index()
Output:
     A      B      C  D
0  bar    one  ZEBRA  1
1  bar  three  ZEBRA  1
2  foo  three  ZEBRA  1
3  foo    two  ZEBRA  1
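A runnable sketch of the `.loc(axis=0)` approach on the question's data follows, together with two other selectors that were not part of this answer and are shown only for comparison: `pd.IndexSlice` and `DataFrame.xs` with `drop_level=False`. The names `via_axis`, `via_slice`, and `via_xs` are illustrative. Slicing a MultiIndex this way relies on the index being sorted, which is the case for the default `groupby` output.

```python
import numpy as np
import pandas as pd

# Same data as in the question; the values in D are random but only counted.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo',
          'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one',
          'three', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': ['MR', 'ZEBRA', 'KID', 'ZEBRA', 'MOS', 'ALPHA', 'ZULU',
          'ZEBRA', 'TREE', 'PLANT', 'JOOMLA', 'ZEBRA', 'MOS', 'ZULU'],
    'D': np.random.randn(14),
})

# Assumption: df_out in the answer is the counted, MultiIndexed DataFrame.
df_out = df.groupby(['A', 'B', 'C']).count()

# .loc(axis=0) says the slicers apply to the row index:
# take everything on levels A and B, and only 'ZEBRA' on level C.
via_axis = df_out.loc(axis=0)[:, :, 'ZEBRA']

# Same selection with pd.IndexSlice, naming rows and columns explicitly.
idx = pd.IndexSlice
via_slice = df_out.loc[idx[:, :, 'ZEBRA'], :]

# xs() selects a cross-section by level; drop_level=False keeps level C.
via_xs = df_out.xs('ZEBRA', level='C', drop_level=False)

print(via_axis.reset_index())
```

Each of the three results should contain the same four ZEBRA rows; calling `reset_index()` on any of them gives the flat table shown above.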