I have a DataFrame that looks like this:
  col1  col2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     3
6    g     3
7    h     3
Input:
import pandas as pd
df = pd.DataFrame({'col1': ["a","b","c","d","e", "f","g","h"], 'col2': [1,1,1,2,2,3,3,3]})
I want to drop the last row of each "col2" group, so that the result looks like this.
Expected output:
  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3
I wrote
df.groupby('col2').tail(1)
which gives me exactly the rows I want to delete, but when I try to drop them from df I get an axis error. Is there a way to fix this?
Answer 0 (score: 2)
It looks like duplicated would work:
df[df.duplicated('col2', keep='last') |
(~df.duplicated('col2', keep=False)) # this is to keep all single-row groups
]
Or, following your approach, drop by index rather than by the rows themselves:
# this would also drop all single-row groups
df.drop(df.groupby('col2').tail(1).index)
Output:
col1 col2
0 a 1
1 b 1
3 d 2
5 f 3
6 g 3
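To make the difference between the two variants concrete, here is a minimal sketch that builds both on the question's data and then on a copy with one extra single-row group. The extra row ('i', 4) and the names df2, not_last and singleton are made up for illustration. The try/except at the end shows one plausible cause of the axis error; that part is an assumption, since the question does not show the exact drop call.

```python
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b", "c", "d", "e", "f", "g", "h"],
                   'col2': [1, 1, 1, 2, 2, 3, 3, 3]})

# True for rows that have a later row with the same col2 value,
# i.e. every row except the last one of its group.
not_last = df.duplicated('col2', keep='last')
# True for rows whose col2 value occurs exactly once (single-row groups).
singleton = ~df.duplicated('col2', keep=False)

via_mask = df[not_last | singleton]
via_drop = df.drop(df.groupby('col2').tail(1).index)

# Hypothetical extra row that forms a single-row group (col2 == 4).
df2 = pd.concat([df, pd.DataFrame({'col1': ['i'], 'col2': [4]})],
                ignore_index=True)
kept_by_mask = df2[df2.duplicated('col2', keep='last')
                   | ~df2.duplicated('col2', keep=False)]
kept_by_drop = df2.drop(df2.groupby('col2').tail(1).index)

# Assumed reproduction of the error: passing the DataFrame itself makes
# drop() look for its column names among the row labels.
try:
    df.drop(df.groupby('col2').tail(1))
    drop_error = None
except KeyError as exc:
    drop_error = exc
```

On the original data both variants keep rows 0, 1, 3, 5 and 6. On df2 the mask version keeps the single-row group (row 8), while the drop version removes it, so pick whichever matches what you want for groups of size one.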
Answer 1 (score: 1)
Try this:
df.groupby('col2', as_index=False).apply(lambda x: x.iloc[:-1,:]).reset_index(drop=True)
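If the one-liner is hard to read, the same idea can be spelled out with an explicit loop over the groups. This is only a sketch of the slice-each-group approach, not the exact code path pandas takes inside apply; the names trimmed and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b", "c", "d", "e", "f", "g", "h"],
                   'col2': [1, 1, 1, 2, 2, 3, 3, 3]})

# Slice off the last row of every group, then glue the pieces back together.
# Iterating over a groupby yields (key, sub-DataFrame) pairs.
trimmed = pd.concat([group.iloc[:-1] for _, group in df.groupby('col2')])

# trimmed still carries the original row labels; reset_index(drop=True)
# renumbers them from 0, as the one-liner above does.
result = trimmed.reset_index(drop=True)
```

Note that after reset_index(drop=True) the rows are labelled 0 to 4 instead of 0, 1, 3, 5, 6 as in the expected output. Skip that step if you need the original labels.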