Pandas groupby that keeps the original DataFrame index

Time: 2020-05-10 14:02:07

Tags: python pandas dataframe pandas-groupby

Input:

import pandas as pd
data = [['Delhi', 'A', 10], ['Delhi', 'B', 12], ['Delhi', 'C', 9], ['Delhi', 'D', 11], ['Mumbai', 'A', 21], ['Mumbai', 'B', 13], ['Mumbai', 'C', 19], ['Mumbai', 'D', 23]]
df = pd.DataFrame(data, columns = ['Name', 'Group', 'Val']) 
df

Out[4]: 
     Name Group  Val
0   Delhi     A   10
1   Delhi     B   12
2   Delhi     C    9
3   Delhi     D   11
4  Mumbai     A   21
5  Mumbai     B   13
6  Mumbai     C   19
7  Mumbai     D   23

I want to group the data but keep the index labels from the original df.

Grouping code: as expected, it resets the index, but I want the result to carry the index of the matching rows in df.

df.groupby('Name')['Val'].max().reset_index()
Out[8]: 
     Name  Val
0   Delhi   12
1  Mumbai   23

Expected output:

     Name  Val
1   Delhi   12
7  Mumbai   23

2 answers:

Answer 0 (score: 3)

Try this:

df.loc[df.groupby('Name').Val.idxmax(),['Name','Val']]

     Name  Val
1   Delhi   12
7  Mumbai   23
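
To see why this works, it can help to look at the intermediate step. `idxmax` returns, for each group, the index label of the row holding the maximum, and `.loc` then selects those rows from the original frame by label. A minimal self-contained sketch using the question's data (the names `max_labels` and `result` are only illustrative):

```python
import pandas as pd

data = [['Delhi', 'A', 10], ['Delhi', 'B', 12], ['Delhi', 'C', 9], ['Delhi', 'D', 11],
        ['Mumbai', 'A', 21], ['Mumbai', 'B', 13], ['Mumbai', 'C', 19], ['Mumbai', 'D', 23]]
df = pd.DataFrame(data, columns=['Name', 'Group', 'Val'])

# For each Name, the index label (taken from df) of the row with the largest Val.
max_labels = df.groupby('Name')['Val'].idxmax()

# Select those rows by label, so the original index is preserved.
result = df.loc[max_labels, ['Name', 'Val']]
print(result)
```

Note that if several rows in a group share the maximum, `idxmax` returns only the first of them.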

Answer 1 (score: 2)

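To keep the original index, you can use transform on the grouped data first. transform broadcasts each group's maximum back onto every row, so comparing it with the original column gives a boolean mask aligned with df. Here is the code: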

# Pass the string 'max' rather than the builtin max, which recent pandas versions warn about.
idx = df.groupby('Name')['Val'].transform('max') == df['Val']
df[idx]

Output:

     Name Group  Val
1   Delhi     B   12
7  Mumbai     D   23
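
The two answers give the same rows for the question's data, but they can differ when a group has tied maxima. The sketch below uses made-up data (the frame `tied` is hypothetical, not from the question) to show the difference. It also restricts the columns to `Name` and `Val` so the result matches the shape of the expected output:

```python
import pandas as pd

# Hypothetical data with a tie: two Delhi rows share the maximum Val of 12.
tied = pd.DataFrame(
    [['Delhi', 'A', 12], ['Delhi', 'B', 12], ['Mumbai', 'A', 21], ['Mumbai', 'B', 23]],
    columns=['Name', 'Group', 'Val'],
)

# Boolean mask: True wherever a row's Val equals its group's maximum.
mask = tied.groupby('Name')['Val'].transform('max') == tied['Val']
all_max_rows = tied.loc[mask, ['Name', 'Val']]  # keeps both tied Delhi rows

# idxmax, by contrast, yields only the first matching label per group.
first_max_rows = tied.loc[tied.groupby('Name')['Val'].idxmax(), ['Name', 'Val']]

print(all_max_rows)
print(first_max_rows)
```

Which behavior is preferable depends on whether every tied row should appear in the result or exactly one row per group.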