I have the following dataframe:
Input Dataframe
   class  section  sub  marks  school   city
0      I        A  Eng     80   jghss  salem
1      I        A  Mat     90   jghss  salem
2      I        A  Eng     50     Nan  salem
3    III        A  Eng     80   gphss    Nan
4    III        A  Mat     45     Nan  salem
5    III        A  Eng     40   gphss    Nan
6    III        A  Eng     20   gphss  salem
7    III        A  Mat     55   gphss    Nan
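Now I need to find the top rank based on the average marks per 'class', 'section' and 'sub', while retaining the values of the remaining columns.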
Aggregated & Grouped Dataframe
   class  section  sub  marks  school   city  rank
0      I        A  Eng     65   jghss  salem     2
1      I        A  Mat     90   jghss  salem     1
2    III        A  Eng     80   gphss  salem     1
3    III        A  Mat     50   gphss  salem     2
Final Outcome
   class  section  sub  marks  school   city  rank
0      I        A  Mat     90   jghss  salem     1
1    III        A  Eng     80   gphss  salem     1
I tried:
df.groupby(['class','section','sub','school','city'])['marks'].mean()
but could not get to the final outcome.
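One possible approach, as a rough sketch rather than a definitive answer: group only on 'class', 'section' and 'sub', take the mean of 'marks', and carry 'school' and 'city' along with the "first" aggregation, which picks the first non-null value in each group. Putting 'school' and 'city' in the group keys, as in the attempt above, splits the groups and (by default) drops rows whose key is missing. The sketch assumes the "Nan" entries are real missing values (np.nan) rather than strings, and that the rank is meant to be computed within each 'class' and 'section'. The names keys, agg and final are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "Nan" is assumed to mean a missing value.
df = pd.DataFrame({
    "class":   ["I", "I", "I", "III", "III", "III", "III", "III"],
    "section": ["A"] * 8,
    "sub":     ["Eng", "Mat", "Eng", "Eng", "Mat", "Eng", "Eng", "Mat"],
    "marks":   [80, 90, 50, 80, 45, 40, 20, 55],
    "school":  ["jghss", "jghss", np.nan, "gphss", np.nan, "gphss", "gphss", "gphss"],
    "city":    ["salem", "salem", "salem", np.nan, "salem", np.nan, "salem", np.nan],
})

keys = ["class", "section", "sub"]

# Mean marks per group; "first" keeps the first non-null school/city in each group.
agg = df.groupby(keys, as_index=False).agg(
    marks=("marks", "mean"),
    school=("school", "first"),
    city=("city", "first"),
)

# Rank the subjects inside each class/section by average marks (1 = highest).
agg["rank"] = (
    agg.groupby(["class", "section"])["marks"]
       .rank(method="dense", ascending=False)
       .astype(int)
)

# Keep only the top-ranked row of each class/section.
final = agg[agg["rank"] == 1].reset_index(drop=True)

print(agg)
print(final)
```

Note that with the sample data the mean for (III, A, Eng) is (80 + 40 + 20) / 3, about 46.67, not the 80 shown in the expected table, so this sketch ranks Mat first for class III. If 80 is really the intended value there, the aggregation for 'marks' would have to be something other than a plain mean.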