Question

I have a DataFrame that looks like this:

Input:

df = pd.DataFrame({'col1': ["a","b","c","d","e", "f","g","h"], 'col2': [1,1,1,2,2,3,3,3]})

I want to drop the last row of each col2 group, so that the result looks like this:

Expected output:

  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3

I wrote code that gives me the rows I want to delete, but when I try df.groupby('col2').tail(1) to drop them I get an axis error. Is there a way to fix this?

Answer 1

It looks like duplicated would work:

df[df.duplicated('col2', keep='last') | 
   (~df.duplicated('col2', keep=False))  # this is to keep all single-row groups
  ]

Or, with your approach, you should drop by index:

# this would also drop all single-row groups
df.drop(df.groupby('col2').tail(1).index)

Output:

  col1  col2
0    a     1
1    b     1
3    d     2
5    f     3
6    g     3
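
The two snippets above only differ when a group has a single row, which the question's data does not contain. Here is a minimal sketch of that difference, assuming a hypothetical extra row 'i' with col2 == 4 added to the question's data (that row is my addition, not part of the original input):

```python
import pandas as pd

# Question's data plus one hypothetical single-row group (col2 == 4)
df = pd.DataFrame({'col1': list("abcdefghi"),
                   'col2': [1, 1, 1, 2, 2, 3, 3, 3, 4]})

# duplicated approach: keep rows that have a later duplicate,
# plus rows whose col2 value occurs only once
keep_singles = df[df.duplicated('col2', keep='last')
                  | ~df.duplicated('col2', keep=False)]

# drop-by-index approach: tail(1) selects the last row of every group,
# including the only row of a single-row group, so that row is dropped too
drop_singles = df.drop(df.groupby('col2').tail(1).index)

print(keep_singles.index.tolist())  # [0, 1, 3, 5, 6, 8]
print(drop_singles.index.tolist())  # [0, 1, 3, 5, 6]
```

Which one you want depends on whether a group with one row should survive or disappear.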

Answer 2

Try this:

df.groupby('col2', as_index=False).apply(lambda x: x.iloc[:-1,:]).reset_index(drop=True)
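
Note that reset_index(drop=True) renumbers the rows from 0, so the index will not match the expected output in the question (0, 1, 3, 5, 6). If the original index labels matter, one possible alternative is a boolean mask built with cumcount. This is a sketch of a different technique, not part of the answer above; the names mask and result are just for illustration:

```python
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b", "c", "d", "e", "f", "g", "h"],
                   'col2': [1, 1, 1, 2, 2, 3, 3, 3]})

# cumcount(ascending=False) numbers the rows of each group in reverse,
# from (group size - 1) down to 0, so the last row of each group gets 0
mask = df.groupby('col2').cumcount(ascending=False) > 0

# Boolean indexing keeps the original index labels
result = df[mask]
print(result)
```

Like the drop-by-index version, this also removes groups that contain only one row.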