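I want to add an index to my DataFrame that restarts within each group, running from 0 up to one less than the number of observations in that group. For example, given: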
pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])
I want:
pd.DataFrame([["John","Car",0],["John","House",1],["Sam","Skate",0],["Sam","Disco",1],["Sam","Space",2]])
Thanks
Answer 0: (score: 1)
Use:
df.groupby(0)[0].apply(lambda x:x.duplicated().cumsum())
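To see why this works, here is a minimal sketch that runs the same two steps by hand on a single group. The variable names (sam, flags, counter) are only illustrative, and the data is the sample from the question.

```python
import pandas as pd

df = pd.DataFrame([["John", "Car"], ["John", "House"],
                   ["Sam", "Skate"], ["Sam", "Disco"], ["Sam", "Space"]])

# Take column 0 for one group only; every value in it is "Sam".
sam = df.loc[df[0] == "Sam", 0]

# duplicated() is False for the first occurrence and True for each repeat.
flags = sam.duplicated()

# Summing the booleans cumulatively turns them into a running counter.
counter = flags.cumsum()

print(flags.tolist())
print(counter.tolist())
```

Because all values inside a group are identical, only the first row is not a duplicate, so the cumulative sum starts at 0 and increases by one per row.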
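Answer 1: (score: 1)
You are looking for cumcount, the cumulative count method on a groupby. It numbers the rows of each group starting from 0: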
df = pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])
df.groupby(0).cumcount()
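cumcount returns a Series aligned with the original index, so it can be assigned straight back as a new column. A short sketch, assuming the new column should be labelled 2 to match the desired output in the question (that label is my choice, not something the question specifies):

```python
import pandas as pd

df = pd.DataFrame([["John", "Car"], ["John", "House"],
                   ["Sam", "Skate"], ["Sam", "Disco"], ["Sam", "Space"]])

# Number the rows within each group of column 0, in their original order,
# and store the result as a third column (label 2 is an assumption).
df[2] = df.groupby(0).cumcount()

print(df)
```

Each name now carries its own counter: 0 and 1 for John, and 0, 1, 2 for Sam.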