Input:
import pandas as pd
data = [['Delhi', 'A', 10], ['Delhi', 'B', 12], ['Delhi', 'C', 9], ['Delhi', 'D', 11], ['Mumbai', 'A', 21], ['Mumbai', 'B', 13], ['Mumbai', 'C', 19], ['Mumbai', 'D', 23]]
df = pd.DataFrame(data, columns = ['Name', 'Group', 'Val'])
df
Out[4]:
Name Group Val
0 Delhi A 10
1 Delhi B 12
2 Delhi C 9
3 Delhi D 11
4 Mumbai A 21
5 Mumbai B 13
6 Mumbai C 19
7 Mumbai D 23
I want to group the data but keep the index of the original df.
Grouping code: as expected, it resets the index, but I want the index from df.
df.groupby('Name')['Val'].max().reset_index()
Out[8]:
Name Val
0 Delhi 12
1 Mumbai 23
Expected output:
Name Val
1 Delhi 12
7 Mumbai 23
Answer 0 (score: 3)
Try this:
df.loc[df.groupby('Name').Val.idxmax(),['Name','Val']]
Name Val
1 Delhi 12
7 Mumbai 23
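To show why this keeps the original index, here is a minimal sketch that splits the one-liner into two steps. The intermediate names `max_labels` and `result` are only illustrative and do not come from the answer.

```python
import pandas as pd

data = [['Delhi', 'A', 10], ['Delhi', 'B', 12], ['Delhi', 'C', 9], ['Delhi', 'D', 11],
        ['Mumbai', 'A', 21], ['Mumbai', 'B', 13], ['Mumbai', 'C', 19], ['Mumbai', 'D', 23]]
df = pd.DataFrame(data, columns=['Name', 'Group', 'Val'])

# Step 1: for each Name, idxmax returns the index label (in df) of the
# row holding the largest Val, rather than the value itself.
max_labels = df.groupby('Name')['Val'].idxmax()

# Step 2: look those labels up in the original frame, so the rows come
# back with their original index.
result = df.loc[max_labels, ['Name', 'Val']]
print(result)
```

Since `idxmax` returns a single label per group, this approach should yield exactly one row per `Name`, even if several rows share the maximum.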
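Answer 1 (score: 2)
To keep the original index, use transform to broadcast each group's maximum back onto the original rows, then compare it with Val to build a boolean mask. The code: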
idx = df.groupby(['Name'])['Val'].transform('max') == df['Val']
df[idx]
Output:
Name Group Val
1 Delhi B 12
7 Mumbai D 23
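One practical difference between the boolean-mask approach and an `idxmax` lookup shows up when a group has tied maxima. The sketch below uses a small hypothetical frame, `tied`, which is not part of the question's data, to illustrate it.

```python
import pandas as pd

# Hypothetical data: Delhi has two rows sharing the maximum Val of 12.
tied = pd.DataFrame({'Name': ['Delhi', 'Delhi', 'Mumbai'],
                     'Val': [12, 12, 23]})

# idxmax lookup: one label per group (the first occurrence of the maximum).
via_idxmax = tied.loc[tied.groupby('Name')['Val'].idxmax()]

# Boolean mask: every row whose Val equals its group's maximum.
mask = tied.groupby('Name')['Val'].transform('max') == tied['Val']
via_mask = tied[mask]

print(via_idxmax)
print(via_mask)
```

Which behaviour is preferable depends on whether tied rows should all be kept or collapsed to a single row per group.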