Question

I have the following code:

import pandas as pd
import numpy as np
df = pd.DataFrame({'clif_cod' : [1,2,3,3,4,4,4],
               'peds_val_fat' : [10.2, 15.2, 30.9, 14.8, 10.99, 39.9, 54.9],
               'mes' : [1,2,4,5,5,6,12],
               'ano' : [2016, 2016, 2016, 2016, 2016, 2016, 2016]})

vetor_valores = df.groupby(['mes','clif_cod']).sum()

which produces this output:

               ano         peds_val_fat
mes clif_cod                    
1   1         2016         10.20
2   2         2016         15.20
4   3         2016         30.90
5   3         2016         14.80
    4         2016         10.99
6   4         2016         39.90
12  4         2016         54.90

How can I select rows based on mes and clif_cod?

When I call list(df), I only get ano and peds_val_fat.

Answer 1

Use pd.IndexSlice:

vetor_valores.loc[[pd.IndexSlice[1,1]],:]
Out[272]: 
               ano  peds_val_fat
mes clif_cod                    
1   1         2016          10.2
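
pd.IndexSlice is also useful when only one of the two levels should be constrained. The following is a minimal sketch that rebuilds the frame from the question; the names idx, only_cod_4 and mid_year are illustrative and not part of the original answer. It relies on the index being sorted, which is the default for a groupby result.

```python
import pandas as pd

df = pd.DataFrame({'clif_cod': [1, 2, 3, 3, 4, 4, 4],
                   'peds_val_fat': [10.2, 15.2, 30.9, 14.8, 10.99, 39.9, 54.9],
                   'mes': [1, 2, 4, 5, 5, 6, 12],
                   'ano': [2016, 2016, 2016, 2016, 2016, 2016, 2016]})
vetor_valores = df.groupby(['mes', 'clif_cod']).sum()

idx = pd.IndexSlice

# Any mes, but only clif_cod == 4
only_cod_4 = vetor_valores.loc[idx[:, 4], :]

# mes from 4 through 6 (label slices include both ends), any clif_cod
mid_year = vetor_valores.loc[idx[4:6, :], :]

print(only_cod_4)
print(mid_year)
```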

Answer 2

IIUC, you can pass as_index=False to your groupby. The group keys then stay as regular columns, and you can work with the result like any other DataFrame:

vetor_valores = df.groupby(['mes','clif_cod'], as_index=False).sum()

>>> vetor_valores
   mes  clif_cod   ano  peds_val_fat
0    1         1  2016         10.20
1    2         2  2016         15.20
2    4         3  2016         30.90
3    5         3  2016         14.80
4    5         4  2016         10.99
5    6         4  2016         39.90
6   12         4  2016         54.90

To access values, you can now use iloc or loc as you would with any DataFrame:

# Select first row:
vetor_valores.iloc[0]
...

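Alternatively, if you have already created the grouped result and don't want to go back and recreate it, you can reset the index, which gives the same result. Note that reset_index returns a new DataFrame, so assign it: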

vetor_valores = vetor_valores.reset_index()
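
Once mes and clif_cod are ordinary columns, selecting by both is a plain boolean filter. This is a minimal sketch under that assumption; the names flat and row are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'clif_cod': [1, 2, 3, 3, 4, 4, 4],
                   'peds_val_fat': [10.2, 15.2, 30.9, 14.8, 10.99, 39.9, 54.9],
                   'mes': [1, 2, 4, 5, 5, 6, 12],
                   'ano': [2016, 2016, 2016, 2016, 2016, 2016, 2016]})
vetor_valores = df.groupby(['mes', 'clif_cod']).sum()

# reset_index moves both index levels back into regular columns
flat = vetor_valores.reset_index()

# list() on a DataFrame lists its columns; mes and clif_cod now appear
print(list(flat))

# Select the row where mes == 5 and clif_cod == 4
row = flat[(flat['mes'] == 5) & (flat['clif_cod'] == 4)]
print(row)
```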

Answer 3

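You have a DataFrame with a two-level MultiIndex: mes and clif_cod are in the index rather than in the columns, which is why listing the columns only shows ano and peds_val_fat. Use both values as a tuple to access a row, e.g. vetor_valores.loc[(4,3)].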
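
A short sketch of that tuple lookup, rebuilt from the question's data. The names as_series and as_frame are illustrative; the explicit `, :` for the column axis is an optional addition that makes it unambiguous that the tuple refers to the row index.

```python
import pandas as pd

df = pd.DataFrame({'clif_cod': [1, 2, 3, 3, 4, 4, 4],
                   'peds_val_fat': [10.2, 15.2, 30.9, 14.8, 10.99, 39.9, 54.9],
                   'mes': [1, 2, 4, 5, 5, 6, 12],
                   'ano': [2016, 2016, 2016, 2016, 2016, 2016, 2016]})
vetor_valores = df.groupby(['mes', 'clif_cod']).sum()

# The group keys are the index level names, not columns
print(list(vetor_valores.index.names))

# A single (mes, clif_cod) tuple returns that row as a Series
as_series = vetor_valores.loc[(4, 3), :]

# A list containing the tuple keeps the result as a one-row DataFrame
as_frame = vetor_valores.loc[[(4, 3)], :]

print(as_series)
print(as_frame)
```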

Answer 4

Use .loc with the axis argument:

vetor_valores.loc(axis=0)[1,:]

Output:

               ano  peds_val_fat
mes clif_cod                    
1   1         2016          10.2
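
The same axis=0 form can constrain the second level instead of the first. As a hedged sketch, the example below also shows DataFrame.xs, which is a different technique from the one in this answer and which drops the selected level by default. The names by_cod and by_cod_xs are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'clif_cod': [1, 2, 3, 3, 4, 4, 4],
                   'peds_val_fat': [10.2, 15.2, 30.9, 14.8, 10.99, 39.9, 54.9],
                   'mes': [1, 2, 4, 5, 5, 6, 12],
                   'ano': [2016, 2016, 2016, 2016, 2016, 2016, 2016]})
vetor_valores = df.groupby(['mes', 'clif_cod']).sum()

# Any mes, clif_cod == 4; both index levels are kept
by_cod = vetor_valores.loc(axis=0)[:, 4]

# Cross-section on the clif_cod level; the result is indexed by mes only
by_cod_xs = vetor_valores.xs(4, level='clif_cod')

print(by_cod)
print(by_cod_xs)
```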
