I have the following data frame:
help <- data.frame(id = c(5,5,5,5,5,12,12,12,17,17,20,20,20,20,20,20),
number = c(1,2,3,4,5,1,2,3,1,2,1,2,3,4,5,6),
episode = c(1,1,1,2,2,1,1,1,1,1,1,1,1,2,2,3))
   id number episode
1   5      1       1
2   5      2       1
3   5      3       1
4   5      4       2
5   5      5       2
6  12      1       1
7  12      2       1
8  12      3       1
9  17      1       1
10 17      2       1
11 20      1       1
12 20      2       1
13 20      3       1
14 20      4       2
15 20      5       2
16 20      6       3
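For each id I have a running observation counter, number, but I also need a counter that restarts within each episode.
I would like the data frame to look like this: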
   id number episode number.ep
1   5      1       1         1
2   5      2       1         2
3   5      3       1         3
4   5      4       2         1
5   5      5       2         2
6  12      1       1         1
7  12      2       1         2
8  12      3       1         3
9  17      1       1         1
10 17      2       1         2
11 20      1       1         1
12 20      2       1         2
13 20      3       1         3
14 20      4       2         1
15 20      5       2         2
16 20      6       3         1
After using group_by(id), I get an error in the mutate command. Any suggestions?
Answer 0 (score: 7)
Using dplyr, group by both id and episode so that the counter restarts in each episode:
library(dplyr)
help %>%
  group_by(id, episode) %>%
  mutate(number.ep = row_number())
Answer 1 (score: 3)
This could be an option:
library(dplyr)
help %>% group_by(id, episode) %>%
mutate(number.ep = seq(1, length(episode), by = 1))
#   id number episode number.ep
#1   5      1       1         1
#2   5      2       1         2
#3   5      3       1         3
#4   5      4       2         1
#5   5      5       2         2
#6  12      1       1         1
#7  12      2       1         2
#8  12      3       1         3
#9  17      1       1         1
#10 17      2       1         2
#11 20      1       1         1
#12 20      2       1         2
#13 20      3       1         3
#14 20      4       2         1
#15 20      5       2         2
#16 20      6       3         1
The equivalent using data.table:
library(data.table)
setDT(help)[, number.ep := seq(.N), by = .(id, episode)]
#> help
# id number episode number.ep
# 1: 5 1 1 1
# 2: 5 2 1 2
# 3: 5 3 1 3
# 4: 5 4 2 1
# 5: 5 5 2 2
# 6: 12 1 1 1
# 7: 12 2 1 2
# 8: 12 3 1 3
# 9: 17 1 1 1
#10: 17 2 1 2
#11: 20 1 1 1
#12: 20 2 1 2
#13: 20 3 1 3
#14: 20 4 2 1
#15: 20 5 2 2
#16: 20 6 3 1
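For comparison, the same per-episode counter can be sketched in base R without any packages. This is not what either answer used: it swaps in ave() with seq_along, and the data frame is renamed dat (a hypothetical name chosen here so that base R's help() is not masked).

```r
# Rebuild the data frame from the question under the hypothetical name `dat`.
dat <- data.frame(
  id      = c(5, 5, 5, 5, 5, 12, 12, 12, 17, 17, 20, 20, 20, 20, 20, 20),
  number  = c(1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 6),
  episode = c(1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3)
)

# ave() applies FUN within each (id, episode) combination and returns a
# vector aligned with the original row order; seq_along() numbers the rows
# of each group starting from 1.
dat$number.ep <- ave(dat$number, dat$id, dat$episode, FUN = seq_along)

print(dat)
```

Like the dplyr and data.table versions, this numbers rows in their existing order within each group, so it assumes the data are already sorted the way the counter should run.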