Mean on a groupby with a MultiIndex in pandas

Time: 2018-09-24 14:48:01

Tags: python pandas aggregate pandas-groupby

I have a GroupBy object named GradeGroup. It is a multi-index groupby, first by "Grade" and then by "HeatNumber". The dataframe has an "Ontime" column, and I am displaying the max of that column for each group.

How can I get the mean of those "Ontime" maxima for each grade? For example, the mean "Ontime" (max) for grade 150HP would be (45.8 + 45.3 + 35.6 + 46.0 + 50.0 + 46.1 + 39.5) / 7, or 44.0. I am looking for something like this:

Grade
    150HP                    44.0
    A100C                    41.1
    A100X                    50.7
    LOWO2A100                42.7
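As a quick check of the arithmetic in the question, the seven per-heat maxima quoted for grade 150HP can be averaged in plain Python. This is only a sketch; the variable names are illustrative and are not part of the original post.

```python
# Per-heat "Ontime" maxima for grade 150HP, copied from the question.
heat_maxima_150hp = [45.8, 45.3, 35.6, 46.0, 50.0, 46.1, 39.5]

# Mean of the maxima: their sum divided by the number of heats.
mean_of_maxima = sum(heat_maxima_150hp) / len(heat_maxima_150hp)

print(round(mean_of_maxima, 1))  # 44.0
```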

2 answers:

Answer 0: (score: 1)

Use max again, then take the mean over the first index level:

GradeGroup.Ontime.max().groupby(level=0).mean()  # on pandas < 2.0, .mean(level=0) also works
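The two steps in this answer can be sketched on a small frame. The rows below are hypothetical sample data, and the way GradeGroup is built is an assumption, since the question does not show either. Because mean() no longer accepts a level argument in pandas 2.0 and later, the sketch groups on the outer index level instead, which gives the same per-grade mean.

```python
import pandas as pd

# Hypothetical sample rows; the question's real dataframe is not shown.
df = pd.DataFrame({
    "Grade": ["150HP", "150HP", "150HP", "A100C", "A100C"],
    "HeatNumber": ["19258", "19258", "19260", "19187", "19787"],
    "Ontime": [45.8, 39.5, 42.8, 31.6, 65.5],
})

# Assumed construction of the question's GradeGroup object.
GradeGroup = df.groupby(["Grade", "HeatNumber"])

# Step 1: max per (Grade, HeatNumber). The result is a Series whose
# index is a two-level MultiIndex named ("Grade", "HeatNumber").
heat_max = GradeGroup.Ontime.max()

# Step 2: average the per-heat maxima within each grade by grouping
# on the outer index level (level=0 and level="Grade" are equivalent here).
grade_mean = heat_max.groupby(level="Grade").mean()

print(heat_max)
print(grade_mean)
```

Referring to the level by name rather than by position keeps the call readable if the grouping order ever changes.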

Answer 1: (score: 0)

You can use groupby(), agg(), and mean():

df.groupby(['Grade','HeatNumber']).agg({'Ontime': 'max'}).groupby(level=0).mean()  # on pandas < 2.0, .mean(level=0) also works

Here is a working example:

import pandas as pd

df = pd.DataFrame({'Grade': ['150HP', '150HP', '150HP', 'A100C', 'A100C', 'A100X', 'A100X', 'A100X', 'LOWO2A100'],
                   'HeatNumber': ['19258', '19258', '19260','19187', '19787', '19261', '19261', '19237', '19262'],
                   'Ontime': [45.8,  39.5, 42.8, 31.6, 65.5, 25.4, 65.1, 21.5, 32.4]})

Gives:

       Grade HeatNumber  Ontime
0      150HP      19258    45.8
1      150HP      19258    39.5
2      150HP      19260    42.8
3      A100C      19187    31.6
4      A100C      19787    65.5
5      A100X      19261    25.4
6      A100X      19261    65.1
7      A100X      19237    21.5
8  LOWO2A100      19262    32.4

Applying the line above gives:

           Ontime
Grade            
150HP       44.30
A100C       48.55
A100X       43.30
LOWO2A100   32.40
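To check the table above end to end, the sample frame and the chained call can be run as one script. This is a sketch that assumes pandas 2.x, where mean() no longer takes a level argument, so the second step is written as groupby(level="Grade").mean(). The name result is illustrative.

```python
import pandas as pd

# Sample data from the answer above.
df = pd.DataFrame({
    "Grade": ["150HP", "150HP", "150HP", "A100C", "A100C",
              "A100X", "A100X", "A100X", "LOWO2A100"],
    "HeatNumber": ["19258", "19258", "19260", "19187", "19787",
                   "19261", "19261", "19237", "19262"],
    "Ontime": [45.8, 39.5, 42.8, 31.6, 65.5, 25.4, 65.1, 21.5, 32.4],
})

# Max "Ontime" per (Grade, HeatNumber), then the mean of those maxima per Grade.
result = (
    df.groupby(["Grade", "HeatNumber"])
      .agg({"Ontime": "max"})
      .groupby(level="Grade")
      .mean()
)

print(result)
```

Because agg() is given a dict, result is a DataFrame with a single "Ontime" column. Selecting the column first, as in df.groupby([...])["Ontime"].max(), would yield a Series instead.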