I have some grouped data, and I want to drop every group that has fewer than 3 rows. The data looks like this:
ID,year,age
810006862,2000,49
810006862,2001,
810006862,2002,
810006862,2003,52
810023112,2003,27
810023112,2004,28
810023112,2005,29
810023112,2006,30
810033622,2000,24
810033622,2001,25
I tried the following code:
import pandas as pd
df = pd.read_csv('sample.csv')
groups = df.groupby(by=['ID'])
print(groups.apply(lambda g: g[2 < g['age'].cumcount()]))
But I get this error message:
AttributeError: 'Series' object has no attribute 'cumcount'
Can anyone help? Thanks in advance. The expected result drops the last group, like this:
ID,year,age
810006862,2000,49
810006862,2001,
810006862,2002,
810006862,2003,52
810023112,2003,27
810023112,2004,28
810023112,2005,29
810023112,2006,30
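
One possible approach, offered as a sketch rather than the only fix: the error occurs because cumcount is a method of GroupBy objects, not of a plain Series, and inside apply the expression g['age'] is a plain Series. Counting rows per group with transform('size') and filtering on that count avoids the problem. The example below inlines the sample data through io.StringIO instead of reading sample.csv, which is an assumption made only so the snippet runs on its own. It uses size rather than count because count skips missing values, and the age column has blanks.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the snippet is self-contained
# (assumption: the real data lives in sample.csv).
csv_text = """ID,year,age
810006862,2000,49
810006862,2001,
810006862,2002,
810006862,2003,52
810023112,2003,27
810023112,2004,28
810023112,2005,29
810023112,2006,30
810033622,2000,24
810033622,2001,25
"""
df = pd.read_csv(io.StringIO(csv_text))

# transform('size') broadcasts each group's row count back onto its rows.
# 'size' counts every row, including rows where age is missing.
group_sizes = df.groupby('ID')['ID'].transform('size')

# Keep only the rows that belong to groups with at least 3 rows.
result = df[group_sizes >= 3]

# Equivalent alternative: let GroupBy.filter keep or drop whole groups.
result_via_filter = df.groupby('ID').filter(lambda g: len(g) >= 3)

print(result)
```

Both variants should keep the same rows. GroupBy.filter reads closer to the stated intent, while the transform version avoids calling a Python function once per group, which may matter on data with many groups.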