I have the following sample data:
df<-data.frame(ID=c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3), CODE=c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B"),
DAT_NUM=c(20180101,20180101,20180105,20180107,20180107,20180108,20180203,20180203,20180201,20180205,
20180501,20180501,20180505,20180507,20180425,20180408,20180403,20180403,20180401,20180405,
20180105,20180105,20180105,20180107,20180107,20180110,20180206,20180203,20180201,20180205))
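I need a new column (test) that assigns a sequential day number (e.g. 1-6) based on DAT_NUM, but the sequence must restart for each unique combination of ID and CODE. Day 1 in the test column is the earliest date within 1A, within 1B, within 2B, within 2A, and so on. There is no limit on how many distinct days DAT_NUM may contain.
Desired output: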
df1<-data.frame(ID=c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3),
CODE=c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B"),
DAT_NUM=c(20180101,20180101,20180105,20180107,20180107,20180108,20180203,20180203,20180201,20180205,
20180501,20180501,20180505,20180507,20180425,20180408,20180403,20180403,20180401,20180405,
20180105,20180105,20180105,20180107,20180107,20180110,20180206,20180203,20180201,20180205),
test=c(1,1,2,3,3,4,2,2,1,3,
2,2,3,4,1,4,2,2,1,3,
1,1,1,2,1,2,6,4,3,5))
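Answer 0 (score: 3)
After grouping by "ID" and "CODE", we can use match to look up each DAT_NUM in the sorted unique dates of its group; the matched position is the day number.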
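library(dplyr)
df %>%
  # restart the numbering for every ID/CODE combination
  group_by(ID, CODE) %>%
  # position of each date among the group's sorted distinct dates
  mutate(test = match(DAT_NUM, sort(unique(DAT_NUM))))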
# A tibble: 30 x 4
# Groups: ID, CODE [6]
# ID CODE DAT_NUM test
# <dbl> <fct> <dbl> <int>
# 1 1 A 20180101 1
# 2 1 A 20180101 1
# 3 1 A 20180105 2
# 4 1 A 20180107 3
# 5 1 A 20180107 3
# 6 1 A 20180108 4
# 7 1 B 20180203 2
# 8 1 B 20180203 2
# 9 1 B 20180201 1
#10 1 B 20180205 3
# … with 20 more rows
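For readers who would rather not load dplyr, the same idea can probably be written in base R with ave(). This is only a sketch under assumptions: the data frame toy and the helper day_index are hypothetical names made up for illustration, and toy is a shortened sample rather than the data from the question.

```r
# Hypothetical, shortened sample (not the full data from the question)
toy <- data.frame(
  ID      = c(1, 1, 1, 1, 2, 2, 2),
  CODE    = c("A", "A", "B", "B", "A", "A", "A"),
  DAT_NUM = c(20180105, 20180101, 20180203, 20180203,
              20180408, 20180401, 20180403)
)

# Hypothetical helper: position of each value among the sorted distinct values
day_index <- function(x) match(x, sort(unique(x)))

# ave() applies day_index separately within each ID/CODE combination
toy$test <- ave(toy$DAT_NUM, toy$ID, toy$CODE, FUN = day_index)

toy$test
# expected: 2 1 1 1 3 1 2
```

Repeated dates share a number and the count restarts in every group, which is the behaviour asked for. Inside the grouped mutate, dplyr's dense_rank(DAT_NUM) should give the same numbering as the match call.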