Question

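This is the MultiIndex DataFrame I got from a groupby. It has two index levels, ['YearMonth', 'product_id'], and one column, ['count']. I have tried the examples from the documentation and from other Stack Overflow answers, but I still cannot select the rows where product_id == 6818 for every YearMonth.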

df = df.groupby(['YearMonth','product_id'])[['count']].sum()
df.head(5)
Out[54]:

                      count
YearMonth   product_id  
2017-05-01  6818    3
7394    1   7394    1
8369    1   8369    1
8504    1   8504    1
8666    1   8666    1



In [55]:
df.columns
Out[55]:
Index(['count'], dtype='object')
In [56]:
df.index.names
Out[56]:
FrozenList(['YearMonth', 'product_id'])
In [59]:
df.loc[('2017-05-01',0),'count']

What I have tried: simple indexing with df['YearMonth'], but that only works for columns, not for index levels.

df.loc / ix / iloc, as shown in another Stack Overflow question:

df.loc[('2017-05-01',0)]

I always get a KeyError, such as KeyError: ('2017-05-01', 0) or KeyError: 'YearMonth'.

I also tried df.unstack(level=0) and then repeated the steps above.

Can someone explain what I am missing? Thanks in advance.

Answer 1

Your sample DF doesn't look "healthy", so I fixed it. It now looks as follows:

In [121]: df
Out[121]:
                       count
YearMonth  product_id
2017-05-01 6818            3
           7394            1
           8369            1
           8504            1
           8666            1
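
As a side note on the errors in the question, here is a minimal sketch of one likely cause, assuming YearMonth is stored as a plain string (the question does not say whether it is a string or a datetime). `.loc` looks up labels, not positions, so `0` fails because no product_id equals 0, and `df['YearMonth']` fails because YearMonth is an index level rather than a column. The two-row sample frame is made up for illustration.

```python
import pandas as pd

# Hypothetical two-row sample; YearMonth is assumed to be a plain string.
index = pd.MultiIndex.from_product(
    [["2017-05-01"], [6818, 7394]],
    names=["YearMonth", "product_id"],
)
df = pd.DataFrame({"count": [3, 1]}, index=index)

# 0 is not a product_id label, so the label-based lookup raises KeyError.
lookup_error = None
try:
    df.loc[("2017-05-01", 0), "count"]
except KeyError as exc:
    lookup_error = exc

# An existing label pair works and returns the single cell.
value = df.loc[("2017-05-01", 6818), "count"]

# YearMonth is an index level, not a column, so df["YearMonth"] would raise.
is_column = "YearMonth" in df.columns

# Index levels are read through the index instead.
year_months = df.index.get_level_values("YearMonth")
```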

Option 1:

In [122]: df.loc[pd.IndexSlice[:, 6818], :]
Out[122]:
                       count
YearMonth  product_id
2017-05-01 6818            3

Option 2: works for named indexes

In [145]: df.query("product_id in [6818]")
Out[145]:
                       count
YearMonth  product_id
2017-05-01 6818            3

Option 3:

In [146]: df.loc[(slice(None), 6818), :]
Out[146]:
                       count
YearMonth  product_id
2017-05-01 6818            3
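
The three options can be tried together on a small reproducible frame. This is only a sketch: the second month and its counts are invented so that the "for every YearMonth" part is visible, and YearMonth is assumed to be a plain string. `DataFrame.xs` is added as a fourth possibility that is not part of the options above; it drops the product_id level from the result.

```python
import pandas as pd

# Hypothetical data: two months, two products (values made up for illustration).
index = pd.MultiIndex.from_product(
    [["2017-05-01", "2017-06-01"], [6818, 7394]],
    names=["YearMonth", "product_id"],
)
df = pd.DataFrame({"count": [3, 1, 2, 4]}, index=index)

# Option 1: IndexSlice, every YearMonth, product_id 6818.
opt1 = df.loc[pd.IndexSlice[:, 6818], :]

# Option 2: query by index level name.
opt2 = df.query("product_id in [6818]")

# Option 3: explicit slice(None) for the first level.
opt3 = df.loc[(slice(None), 6818), :]

# Extra (not from the options above): cross-section on the product_id level,
# which returns a frame indexed by YearMonth only.
by_month = df.xs(6818, level="product_id")

print(opt1)
print(by_month)
```

All three options keep both index levels in the result, while `xs` returns one row per YearMonth, which may be more convenient if only the counts per month are needed.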

Basic indexing on axis with MultiIndex
