将大熊猫分组为最大值后,如何完全沿着max()值显示匹配的行结果
df = data.loc[:,['no','std_date','result_date','result']]
result = df.groupby(['no','std_date'])[['result_date','result']].max()
我在每个“ std_date”上显示结果并发送结果。 但是现在'result_date'和'result'每列的最大值。enter image description here
<original data >
no std_date result_date result
0 s1 2012-12-24 2012-07-03 601.19
1 s2 2011-11-23 2011-05-16 1.96
2 s2 2011-11-23 2011-11-10 15.2
3 s3 2016-08-10 2016-04-23 14.05
4 s5 2013-11-12 2013-05-31 34.73
5 s5 2013-11-12 2013-07-22 12
6 s5 2017-11-14 2014-05-09 0.59
7 s5 2017-11-14 2016-09-07 0.8
8 s5 2017-11-14 2017-06-20 1.54
9 s6 2012-06-01 2012-05-14 1.08
10 s7 2018-11-19 2018-04-10 25.74
11 s7 2018-11-19 2018-10-16 12.49
12 s8 2013-12-24 2013-10-17 26.48
13 s8 2016-01-20 2013-12-31 1.36
14 s8 2016-01-20 2014-08-27 0.9
15 s8 2016-01-20 2015-06-29 3.72
<I want to >
result_date result
no std_date
s1 2012-12-24 2012-11-28 601.19
s2 2011-11-23 2011-11-10 15.2
s3 2016-08-10 2016-04-23 14.05
s5 2013-11-12 2013-07-22 12
s5 2017-11-14 2017-06-20 1.54
s6 2012-06-01 2012-05-14 1.08
s7 2018-11-19 2018-10-16 12.49
s8 2013-12-24 2013-10-17 26.48
s8 2016-01-20 2015-06-29 3.72
答案 0 :(得分:1)
您可以保留结果,然后将其合并回DataFrame
result = df.groupby(['no','std_date'])[['result_date']]\
.max()\
.reset_index()\
.merge(df,on=['no','std_date','result_date'])
输出:
no std_date result_date result
0 s1 12-12-24 12-07-03 601.19
1 s2 11-11-23 11-11-10 15.20
2 s3 16-08-10 16-04-23 14.05
3 s5 13-11-12 13-07-22 12.00
4 s5 17-11-14 17-06-20 1.54
5 s6 12-06-01 12-05-14 1.08
6 s7 18-11-19 18-10-16 12.49
7 s8 13-12-24 13-10-17 26.48
8 s8 16-01-20 15-06-29 3.72
答案 1 :(得分:0)
您也可以使用 .idxmax()
,它的计算成本比合并要低。
df.loc[df.groupby(['no', 'std_date'])['result_date'].idxmax()]