I have a data frame:
id player
8297682400 Player1
8297692740 Player1
8255798760 Player1
8255798760 Player1
8255798760 Player1
8255799456 Player2
8255799456 Player2
8255799456 Player2
8255866000 Player2
8255866000 Player2
8255866000 Player2
8255826600 Player1
8255826600 Player1
8255826600 Player1
8255854600 Player2
8255854700 Player1
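If I use group_by(player, id), I know I can easily number the rows within each group with %>% mutate(counter = 1:n()).
But how do I count the unique id values for each player, "pausing" the count whenever a duplicate is found?
I want: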
id player id_counter
8297682400 Player1 1
8297692740 Player1 2
8255798760 Player1 3
8255798760 Player1 3
8255798760 Player1 3
8255799456 Player2 1
8255799456 Player2 1
8255799456 Player2 1
8255866000 Player2 2
8255866000 Player2 2
8255866000 Player2 2
8255826600 Player1 4
8255826600 Player1 4
8255826600 Player1 4
8255854600 Player2 3
8255854700 Player1 5
Answer 0 (score: 4)
We can use match, which returns the position of each id within that player's unique ids:
df1 %>%
group_by(player) %>%
mutate(id_counter = match(id, unique(id)))
# A tibble: 16 x 3
# Groups: player [2]
# id player id_counter
# <dbl> <chr> <int>
# 1 8297682400 Player1 1
# 2 8297692740 Player1 2
# 3 8255798760 Player1 3
# 4 8255798760 Player1 3
# 5 8255798760 Player1 3
# 6 8255799456 Player2 1
# 7 8255799456 Player2 1
# 8 8255799456 Player2 1
# 9 8255866000 Player2 2
#10 8255866000 Player2 2
#11 8255866000 Player2 2
#12 8255826600 Player1 4
#13 8255826600 Player1 4
#14 8255826600 Player1 4
#15 8255854600 Player2 3
#16 8255854700 Player1 5
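To see why this works, here is a minimal base R sketch on a single vector. The `ids` vector is only an illustration, built from the first five Player1 rows of the question. `unique()` keeps the distinct values in order of first appearance, and `match()` looks up each id's position in that list, so repeated ids get the same number.

```r
# Illustrative vector: the first five Player1 ids from the question
ids <- c(8297682400, 8297692740, 8255798760, 8255798760, 8255798760)

# Distinct ids, in order of first appearance
distinct_ids <- unique(ids)

# Position of each id among the distinct ids
counter <- match(ids, distinct_ids)
print(counter)
# [1] 1 2 3 3 3
```

Note that an id that reappears later, after other ids, would get its original number back rather than a new one.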
Or by converting id to a factor, with the levels in order of first appearance, and coercing it to integer:
df1 %>%
group_by(player) %>%
mutate(id_counter = as.integer(factor(id, levels = unique(id))))
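For anyone who wants to try both approaches, the following is a self-contained sketch. The construction of `df1` is an assumption reconstructed from the data printed in the question, and the column names `by_match` and `by_factor` are made up here just to compare the two results side by side.

```r
library(dplyr)

# df1 rebuilt from the question's data (assumed to be numeric id, character player)
df1 <- data.frame(
  id = c(8297682400, 8297692740, 8255798760, 8255798760, 8255798760,
         8255799456, 8255799456, 8255799456, 8255866000, 8255866000,
         8255866000, 8255826600, 8255826600, 8255826600, 8255854600,
         8255854700),
  player = c(rep("Player1", 5), rep("Player2", 6), rep("Player1", 3),
             "Player2", "Player1"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(player) %>%
  mutate(
    # position of each id among the player's distinct ids
    by_match  = match(id, unique(id)),
    # integer codes of a factor whose levels follow first appearance
    by_factor = as.integer(factor(id, levels = unique(id)))
  ) %>%
  ungroup()

# Check that the two approaches agree on this data
same_result <- identical(res$by_match, res$by_factor)
print(same_result)
```

Passing `levels = unique(id)` matters in the factor version: without it, the levels are sorted, so the counter would follow the sorted order of the ids instead of their order of appearance.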