Pandas - select rows with best values

Time: 2016-07-11 22:13:33

Tags: python pandas dataframe

I have this dataframe:

   col1 col2  col3
0     2    A     1
1     1    A   100
2     3    B    12
3     4    B     2

I want to select the row with the highest col1 value among all rows where col2 is A, then the same for B, and so on. This is the desired output:

   col1 col2  col3
0     2    A     1
3     4    B     2

I know I need some kind of groupby('col2'), but I don't know what to use after that.

2 answers:

Answer 0 (score: 3)

Is this what you want?

In [16]: df.groupby('col2').max().reset_index()
Out[16]:
  col2  col1
0    A     2
1    B     4
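
One thing to keep in mind with this approach: when the frame has further numeric columns, such as col3 in the question, max() is taken per column within each group, so a result row can mix values from different original rows. A minimal sketch, assuming the question's frame is rebuilt by hand as below (the name per_column_max is only illustrative):

```python
import pandas as pd

# Frame rebuilt by hand from the question (dtypes are an assumption).
df = pd.DataFrame({
    "col1": [2, 1, 3, 4],
    "col2": ["A", "A", "B", "B"],
    "col3": [1, 100, 12, 2],
})

# max() aggregates every remaining column independently within each group,
# so col1 and col3 in one result row need not come from the same source row.
per_column_max = df.groupby("col2").max().reset_index()
print(per_column_max)
```

Here group A gets col1 from row 0 and col3 from row 1, which is not the row asked for in the question.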

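Answer 1 (score: 2)

Use groupby('col2'), then idxmax to get the index of the maximum col1 value in each group. Finally, use those index values to slice the original dataframe.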

df.loc[df.groupby('col2').col1.idxmax()]

Note that the index values of the original dataframe are preserved.
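
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the question's dataframe is rebuilt by hand as shown; the names best_idx and best_rows are only illustrative.

```python
import pandas as pd

# Frame rebuilt by hand from the question (dtypes are an assumption).
df = pd.DataFrame({
    "col1": [2, 1, 3, 4],
    "col2": ["A", "A", "B", "B"],
    "col3": [1, 100, 12, 2],
})

# Index label of the row holding the largest col1 within each col2 group.
best_idx = df.groupby("col2")["col1"].idxmax()

# Select those full rows; the original index labels are kept.
best_rows = df.loc[best_idx]
print(best_rows)
```

Because whole rows are selected by label, col3 stays paired with the col1 value it belongs to. If a fresh 0..n-1 index is preferred, calling reset_index(drop=True) on the result should do it.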