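Related to the question here: Reordering pandas dataframe based on multiple column and sum of one column
Using the sort column, how can I take the top 2 countries in this dataframe? In this case, the top two countries would be Australia and Afghanistan.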
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
2      Algeria  bus    827000.0    829351.0
3      Algeria  bus      2351.0    829351.0
EDIT:
I would also like to keep the type column. In that case, the solution should look like this:
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
Answer 0 (score: 1)
UPDATE:
In [166]: df.loc[df.Country_FAO.isin(df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index)]
Out[166]:
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
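For anyone who wants to reproduce the update outside an IPython session, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and assumes the sort column is simply the per-country sum of mean_area (the question does not show how it was computed). The names totals, top2 and result are illustrative only. Selecting mean_area before summing is a small variation on the one-liner above; it keeps non-numeric columns such as type out of the aggregation.

```python
import pandas as pd

# Sample data copied from the question; the default 0..6 index matches the post.
df = pd.DataFrame(
    {
        "Country_FAO": ["Afghanistan", "Afghanistan", "Algeria", "Algeria",
                        "Australia", "Australia", "Australia"],
        "type": ["car", "car", "bus", "bus", "car", "car", "bus"],
        "mean_area": [2029000.0, 112000.0, 827000.0, 2351.0,
                      6475695.0, 12141000.0, 293806.0],
    }
)

# Assumption: "sort" is the per-country total of mean_area.
df["sort"] = df.groupby("Country_FAO")["mean_area"].transform("sum")

# Per-country totals, then the labels of the two largest.
totals = df.groupby("Country_FAO")["mean_area"].sum()
top2 = totals.nlargest(2).index

# Keep every row (and every column, including "type") of those countries,
# ordered by country total and then by mean_area, both descending.
result = (
    df[df["Country_FAO"].isin(top2)]
    .sort_values(["sort", "mean_area"], ascending=False)
)
print(result)
```

The isin filter keeps the original rows rather than the aggregated ones, which is why the type column survives.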
I would do it this way:
In [153]: df.groupby('Country_FAO').sum()
Out[153]:
              mean_area
Country_FAO
Afghanistan   2141000.0
Algeria        829351.0
Australia    18910501.0
In [154]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area')
Out[154]:
              mean_area
Country_FAO
Australia    18910501.0
Afghanistan   2141000.0
In [155]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index
Out[155]: Index(['Australia', 'Afghanistan'], dtype='object', name='Country_FAO')
Additionally, you may want to reset the index:
In [156]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').reset_index()
Out[156]:
   Country_FAO   mean_area
0    Australia  18910501.0
1  Afghanistan   2141000.0
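Since the question asks about using the existing sort column, a possible alternative is to skip the groupby and read the totals straight from that column. This is only a sketch: it assumes sort already holds the same total on every row of a country, as in the sample data, and the names top2 and result are illustrative.

```python
import pandas as pd

# Sample frame in the order shown in the question, including its index labels.
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0, 18910501.0, 18910501.0,
                 2141000.0, 2141000.0, 829351.0, 829351.0],
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# One row per country is enough, because "sort" repeats within a country.
top2 = df.drop_duplicates("Country_FAO").nlargest(2, "sort")["Country_FAO"]

# Filter the full frame so all rows and the "type" column are kept.
result = df[df["Country_FAO"].isin(top2)]
print(result)
```

If the sort column is not guaranteed to be consistent within a country, the groupby approach above is the safer choice.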