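Full disclosure: I'm a newbie, so please bear with me. I have a data file. I first need to sort it by the zip_code column, and then I need to find the highest score for each zip code.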
Fname Lname Area Score
Amy Doe 3 245
Jon Doe 1 310
Jane Doe 2 724
Brian Doe 1 840
Gary Doe 3 632
Jen Doe 2 854
Jim Doe 3 132
Rick Doe 1 445
import pandas as pd
from pandas import DataFrame
file = pd.read_csv('test.dat',delimiter=',' )
df = DataFrame(file, columns=['Fname','Lname','Score','zip_code'])
df.sort_values(by=['Area','Score'], inplace=True)
print(df)
Desired output:
Fname Lname Area Score
Brian Doe 1 840-->Winner!
Rick Doe 1 445
Jon Doe 1 310
Jen Doe 2 854-->Winner!
Jane Doe 2 132
Gary Doe 3 632-->Winner!
Jim Doe 3 132
Rick Doe 3 445
Actual output:
Fname Lname Score Area
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
I haven't figured out how to summarize that column yet. Can you tell me what I'm doing wrong?
Answer 0 (score: 0)
Try groupby().idxmax():
df.loc[df.groupby('zip_code').Score.idxmax()]
Output:
First Last Score zip_code
0 Amy Smith 56 32003
1 Brian Smith 90 32025
6 Kelly Jones 20 32080
2 Joe Doe 90 32084
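For reference, here is a minimal sketch of the same idea applied to the sample rows from the question. It rests on a few assumptions: that test.dat is whitespace-delimited (as the sample suggests), that the grouping column is the one shown as Area in the sample (the question also calls it zip_code), and that the rows are inlined through io.StringIO only so the sketch runs without the file. It also shows one likely cause of the all-NaN output: reading whitespace-separated data with delimiter=',' produces a single column, and asking DataFrame for column names that do not exist fills them with NaN.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so no file is needed.
raw = """Fname Lname Area Score
Amy Doe 3 245
Jon Doe 1 310
Jane Doe 2 724
Brian Doe 1 840
Gary Doe 3 632
Jen Doe 2 854
Jim Doe 3 132
Rick Doe 1 445
"""

# Likely cause of the NaN output (assumption: the file has no commas).
# With delimiter=',' every line becomes one wide column, so none of the
# requested column names exist and each one is filled with NaN.
bad = pd.read_csv(io.StringIO(raw), delimiter=",")
all_nan = pd.DataFrame(bad, columns=["Fname", "Lname", "Score", "zip_code"])

# Reading with a whitespace separator yields the four expected columns.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Sort by area, then by score from highest to lowest within each area.
df_sorted = df.sort_values(by=["Area", "Score"], ascending=[True, False])

# idxmax returns the row label of the highest Score in each Area;
# df.loc then pulls those full rows.
winners = df.loc[df.groupby("Area")["Score"].idxmax()]

print(df_sorted)
print(winners)
```

If the real file uses a different separator or column names, adjust sep and the column names accordingly.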