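Say I have a df like the one below. I need to group by links, and if a link is repeated more than 3 times, the number at the end of it should be incremented.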
name links
A https://a.com/-pg0
B https://b.com/-pg0
C https://c.com/-pg0
D https://c.com/-pg0
x https://c.com/-pg0
y https://c.com/-pg0
z https://c.com/-pg0
E https://e.com/-pg0
F https://e.com/-pg0
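Expected output: the link for names C, D, x, y, z is repeated more than 3 times, so its first 3 occurrences keep 0 and the following ones are incremented.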
name links
A https://a.com/-pg0
B https://b.com/-pg0
C https://c.com/-pg0
D https://c.com/-pg0
x https://c.com/-pg0
y https://c.com/-pg1
z https://c.com/-pg1
E https://e.com/-pg0
F https://e.com/-pg0
Answer 0 (score: 3)
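You can try cumcount together with // (floor division). cumcount numbers the rows within each group of identical links starting from 0, so floor-dividing by 3 gives 0 for the first three rows, 1 for the next three, and so on.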
s = df.groupby('links').cumcount()//3
Out[125]:
0 0
1 0
2 0
3 0
4 0
5 1
6 1
7 0
8 0
dtype: int64
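# the links already end in "0", so drop that last character before appending the counter
# (plain concatenation would produce "-pg00" / "-pg01")
df['links'] = df['links'].str[:-1] + s.astype(str)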
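Putting it together on the sample data from the question, here is a minimal, self-contained sketch. It assumes the counter is the run of digits at the very end of each link, and it strips those digits with a regex before appending the new counter, since appending directly to an existing "-pg0" would give "-pg00". The `base` variable name is only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "x", "y", "z", "E", "F"],
    "links": [
        "https://a.com/-pg0",
        "https://b.com/-pg0",
        "https://c.com/-pg0",
        "https://c.com/-pg0",
        "https://c.com/-pg0",
        "https://c.com/-pg0",
        "https://c.com/-pg0",
        "https://e.com/-pg0",
        "https://e.com/-pg0",
    ],
})

# Position of each row within its group of identical links (0, 1, 2, ...),
# floor-divided by 3 so that every block of three rows shares one counter.
s = df.groupby("links").cumcount() // 3

# Assumption: the old counter is the trailing run of digits; remove it,
# then append the new counter as a string.
base = df["links"].str.replace(r"\d+$", "", regex=True)
df["links"] = base + s.astype(str)

print(df)
```

On this data the fourth and fifth c.com rows (y and z) end in -pg1 and every other row keeps -pg0, which matches the expected output in the question. A link repeated seven or more times would continue with -pg2 for its next block of three.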