Grouped cumulative maximum in pandas

Time: 2021-01-12 01:58:41

Tags: python pandas dataframe

import pandas as pd

gender = {'Type': ["Male","Male","Male","Female","Female","Female","Male","Male","Male","Female","Female","Female","Female","Male","Male","Female","Female"], 
          'Age': [40, 22, 23, 55, 75, 31, 22, 23, 26, 29, 33, 32, 40, 41, 47, 48, 54]}

df = pd.DataFrame.from_dict(gender)

print(df)
          
     Type  Age
0     Male   40
1     Male   22
2     Male   23
3   Female   55
4   Female   75
5   Female   31
6     Male   22
7     Male   23
8     Male   26
9   Female   29
10  Female   33
11  Female   32
12  Female   40
13    Male   41
14    Male   47
15  Female   48
16  Female   54


Expected Output

     Type  Age      Group_Max
0     Male   40     40
1     Male   22     40
2     Male   23     40
3   Female   55     75
4   Female   75     75
5   Female   31     75
6     Male   22     26
7     Male   23     26
8     Male   26     26
9   Female   29     40
10  Female   33     40
11  Female   32     40
12  Female   40     40
13    Male   41     47
14    Male   47     47
15  Female   48     54
16  Female   54     54

I am not looking for the output below, because I want the cummax to reset every time the gender type changes:

df["cumsum"] = df.groupby(['Type']).Age.cummax()

      Type  Age  cumsum
0     Male   40      40
1     Male   22      40
2     Male   23      40
3   Female   55      55
4   Female   75      75
5   Female   31      75
6     Male   22      40
7     Male   23      40
8     Male   26      40
9   Female   29      75
10  Female   33      75
11  Female   32      75
12  Female   40      75
13    Male   41      41
14    Male   47      47
15  Female   48      75
16  Female   54      75

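1 answer:

Answer 0 (score: 1)

Create two helper columns: one that marks where each consecutive group starts, and one that labels the groups.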

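# True on the first row of every consecutive run of the same Type
df['tag'] = df['Type'] != df['Type'].shift(1)
# Number the run starts 1, 2, 3, ... (rows that are not a start stay NaN)
df['label'] = df.loc[df['tag'], 'tag'].cumsum()
# Carry each run number forward to the remaining rows of that run
df['label'] = df['label'].ffill()
# Maximum Age within each run, broadcast back to every row of the run
df['Group_Max'] = df.groupby('label')['Age'].transform('max')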

Output:

      Type  Age    tag  label  Group_Max
0     Male   40   True    1.0         40
1     Male   22  False    1.0         40
2     Male   23  False    1.0         40
3   Female   55   True    2.0         75
4   Female   75  False    2.0         75
5   Female   31  False    2.0         75
6     Male   22   True    3.0         26
7     Male   23  False    3.0         26
8     Male   26  False    3.0         26
9   Female   29   True    4.0         40
10  Female   33  False    4.0         40
11  Female   32  False    4.0         40
12  Female   40  False    4.0         40
13    Male   41   True    5.0         47
14    Male   47  False    5.0         47
15  Female   48   True    6.0         54
16  Female   54  False    6.0         54
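
The two helper columns can probably be folded into a single grouping key. The following is a minimal sketch under that assumption, not part of the original answer: `run_id` and `Run_Cummax` are names introduced here purely for illustration. It builds the run number in one step and then computes both the per-run maximum (which matches the expected output) and a running maximum that restarts at each change of `Type`, in case the literal "cummax that resets" is what is wanted.

```python
import pandas as pd

df = pd.DataFrame({
    "Type": ["Male"] * 3 + ["Female"] * 3 + ["Male"] * 3
            + ["Female"] * 4 + ["Male"] * 2 + ["Female"] * 2,
    "Age": [40, 22, 23, 55, 75, 31, 22, 23, 26, 29, 33, 32, 40, 41, 47, 48, 54],
})

# A new run starts wherever Type differs from the previous row;
# the cumulative sum of those boundaries numbers the runs 1, 2, 3, ...
# (run_id is a hypothetical helper name, not from the original answer)
run_id = df["Type"].ne(df["Type"].shift()).cumsum()

# Maximum of each consecutive run, broadcast to every row of that run.
df["Group_Max"] = df.groupby(run_id)["Age"].transform("max")

# Running maximum that restarts at the beginning of each run.
df["Run_Cummax"] = df.groupby(run_id)["Age"].cummax()

print(df)
```

Grouping by the `run_id` Series rather than by `Type` is what keeps non-adjacent blocks of the same gender apart, and it avoids leaving the `tag` and `label` columns in the frame.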