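I want to compute the maximum value for each year and show the sector together with that value. For example, from the screenshot, I would like to display: 2010: Telecom 781, 2011: Tech 973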
I tried using: df.groupby(['Year','Sector'])['Revenue'].max()
Answer 0: (score: 2)
Try using idxmax together with loc:
df.loc[df.groupby(['Sector','Year'])['Revenue'].idxmax()]
MVCE:
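import pandas as pd
import numpy as np

np.random.seed(123)  # fixed seed so the random revenues are reproducible
df = pd.DataFrame({'Sector': ['Telecom', 'Tech', 'Financial Service', 'Construction', 'Heath Care'] * 3,
                   'Year': [2010, 2011, 2012, 2013, 2014] * 3,
                   'Revenue': np.random.randint(101, 999, 15)})
# idxmax returns the row label of the largest Revenue in each group; loc fetches those full rows
df.loc[df.groupby(['Sector', 'Year'])['Revenue'].idxmax()]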
Output:
               Sector  Year  Revenue
3        Construction  2013      423
12  Financial Service  2012      838
9          Heath Care  2014      224
1                Tech  2011      466
5             Telecom  2010      843
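In the MVCE above every sector appears in exactly one year, so grouping by both columns happens to give the same groups as grouping by year alone. The following is a minimal sketch of the more general case, assuming several sectors per year; the 781 and 973 figures come from the question, while the other two revenues are made up for illustration.

```python
import pandas as pd

# Hypothetical data: two sectors per year, so the per-year maximum is not trivial.
df = pd.DataFrame({
    'Sector': ['Telecom', 'Tech', 'Telecom', 'Tech'],
    'Year': [2010, 2010, 2011, 2011],
    'Revenue': [781, 650, 420, 973],
})

# Grouping by both columns keeps every (Year, Sector) pair: four rows here,
# which is why the attempt in the question does not reduce to one row per year.
per_pair = df.groupby(['Year', 'Sector'])['Revenue'].max()

# Grouping by Year alone and looking up the row label of each maximum
# keeps one full row per year, including the Sector column.
per_year = df.loc[df.groupby('Year')['Revenue'].idxmax()]
print(per_year)
```

With this data, per_year holds the Telecom row for 2010 and the Tech row for 2011, which is the shape of result the question asks for.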
Answer 1: (score: 2)
Alternatively, use .sort_values + .tail, grouping by year only. Data from @Scott Boston:
df.sort_values('Revenue').groupby('Year').tail(1)
Output:
               Sector  Year  Revenue
9          Heath Care  2014      224
3        Construction  2013      423
1                Tech  2011      466
12  Financial Service  2012      838
5             Telecom  2010      843
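Because the frame is sorted by Revenue first, the rows above come back ordered by revenue rather than by year. A small sketch of the same approach with a final sort on Year, assuming a hypothetical frame with two sectors per year (only the 781 and 973 values are taken from the question):

```python
import pandas as pd

# Hypothetical data with two sectors per year.
df = pd.DataFrame({
    'Sector': ['Telecom', 'Tech', 'Telecom', 'Tech'],
    'Year': [2010, 2010, 2011, 2011],
    'Revenue': [781, 650, 420, 973],
})

top = (
    df.sort_values('Revenue')   # ascending, so each year's largest revenue comes last
      .groupby('Year')
      .tail(1)                  # keep the last row of each year group
      .sort_values('Year')      # present the result in year order
)
print(top)
```

The original row labels are preserved, so the result can still be matched back to the source frame.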