我有一个csv文件,其中包含诸如StateName,Population,CityName等列。请注意,对于每个州,您可以有多个城市名称,因此同一城市有多个人口
我想要拥有的是将StateName与同一城市的最高三个人口分组。
what i have: (image click to see)
what i want to have (image click to see) 我的代码是:
def answer_six():
x=census_df['STNAME'].unique()
census_df2 = df = pd.DataFrame()
for a in x :
census_dfcopy = census_df.copy()
census_dfcopy = census_dfcopy.set_index(['STNAME'])
census_dfcopy = census_dfcopy.loc[a]
census_dfcopy = census_dfcopy.reset_index()
census_dfcopy = census_dfcopy.set_index(['CENSUS2010POP'])
census_dfcopy1=census_dfcopy.sort_index(ascending = False)
census_dfcopy1= census_dfcopy1.append(census_dfcopy1)
census_dfcopy1.groupby('STNAME')
return census_dfcopy1.head(3)
answer_six()
我只获得最后一个状态的最后3个值。
要下载csv文件,请访问链接: https://drive.google.com/open?id=1ptE6MRQ1NGrfRYBB7NKjqhOJZXlxScPo
答案 0 :(得分:0)
你可以做
census_df.groupby('STNAME').CENSUS2010POP.nlargest(3)
实际情况:
In [51]: df
Out[51]:
ctyname pop stname
0 0 10 a
1 1 9 a
2 2 1 a
3 3 3 a
4 4 12 b
5 5 12 b
6 6 13 b
7 7 14 b
8 8 4 c
9 9 3 c
10 10 2 c
11 11 1 c
In [68]: df.groupby('stname').pop.nlargest(3)
Out[68]:
stname
a 0 10
1 9
3 3
b 7 14
6 13
4 12
c 8 4
9 3
10 2