In this DataFrame:
region area other
alabama 99151.5 0.564506436
alabama 99151.5 0.193809515
arkansas 165927 0.878569179
arkansas 165927 0.00946268
arkansas 165927 0.075263353
colorado 408747 0.62052038
colorado 408747 0.723038731
georgia 117363 0.970624899
georgia 117363 0.534441671
idaho 198303 0.378282313
idaho 198303 0.836349349
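I want to keep only the rows for the 2 regions with the largest area, but I can't use the pandas nlargest command, because it doesn't let me keep the duplicate entries in the region column. How can I do this?
- Edit:
Expected output: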
region area other
colorado 408747 0.62052038
colorado 408747 0.723038731
idaho 198303 0.378282313
idaho 198303 0.836349349
Answer 0 (score: 3)
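Use sort_values, groupby, and head to find the two regions with the largest area, then keep every row from those regions with isin:
top = df.groupby('region')['area'].first().sort_values(ascending=False).head(2).index
df[df['region'].isin(top)]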
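For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and reproduces the expected output. It uses a slightly different route (drop_duplicates followed by nlargest on the de-duplicated frame), and it assumes that area is constant within each region, as it is in the sample. The variable names top_regions and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "region": ["alabama", "alabama", "arkansas", "arkansas", "arkansas",
               "colorado", "colorado", "georgia", "georgia", "idaho", "idaho"],
    "area": [99151.5, 99151.5, 165927, 165927, 165927,
             408747, 408747, 117363, 117363, 198303, 198303],
    "other": [0.564506436, 0.193809515, 0.878569179, 0.00946268, 0.075263353,
              0.62052038, 0.723038731, 0.970624899, 0.534441671,
              0.378282313, 0.836349349],
})

# Assumption: every row of a region carries the same area, so one row
# per region is enough to rank the regions.
top_regions = df.drop_duplicates("region").nlargest(2, "area")["region"]

# Keep every row (duplicates included) that belongs to those regions.
result = df[df["region"].isin(top_regions)]
print(result)
```

Because the final step is a boolean filter on the original frame, the rows keep their original order and all duplicates in the region column are preserved.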