My dataset looks like this:
ID VISIT_ID DATE DV
1001 112233 12-23 3
1001 112233 12-23 4
1001 112244 12-23 5
1001 112244 12-23 6
1001 112244 12-23 7
1001 112244 12-23 8
1002 112254 12-23 3
1002 112254 12-23 4
1002 112254 12-23 5
1002 112264 12-23 6
1002 112264 12-23 7
1002 112264 12-23 8
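The result I want is shown below: each unique VISIT_ID is assigned an incrementing encounter number, and the sequence restarts at 1 for each ID. Any help would be much appreciated.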
ID VISIT_ID DATE DV ENCOUNTER
1001 112233 12-23 3 1
1001 112233 12-23 4 1
1001 112244 12-23 5 2
1001 112244 12-23 6 2
1001 112244 12-23 7 2
1001 112244 12-23 8 2
1002 112254 12-23 3 1
1002 112254 12-23 4 1
1002 112254 12-23 5 1
1002 112264 12-23 6 2
1002 112264 12-23 7 2
1002 112264 12-23 8 2
Answer 0 (score: 1)
We can use match to find the index of each unique 'VISIT_ID' after grouping by 'ID':
library(dplyr)
df1 %>%
group_by(ID) %>%
mutate(ENCOUNTER = match(VISIT_ID, unique(VISIT_ID)))
# ID VISIT_ID DATE DV ENCOUNTER
# <int> <int> <chr> <int> <int>
#1 1001 112233 12-23 3 1
#2 1001 112233 12-23 4 1
#3 1001 112244 12-23 5 2
#4 1001 112244 12-23 6 2
#5 1001 112244 12-23 7 2
#6 1001 112244 12-23 8 2
#7 1002 112254 12-23 3 1
#8 1002 112254 12-23 4 1
#9 1002 112254 12-23 5 1
#10 1002 112264 12-23 6 2
#11 1002 112264 12-23 7 2
#12 1002 112264 12-23 8 2
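The snippets in this answer assume a data frame named df1 that the thread never shows being built. A minimal sketch to rebuild it from the question's table follows; the column types (integer IDs, character DATE) are an assumption based on the printed tibble header, and `expected` and `enc` are hypothetical helper names used only to check a result.

```r
# Rebuild the question's example data (column types are an assumption).
df1 <- data.frame(
  ID       = rep(c(1001L, 1002L), each = 6),
  VISIT_ID = c(rep(112233L, 2), rep(112244L, 4),
               rep(112254L, 3), rep(112264L, 3)),
  DATE     = "12-23",
  DV       = c(3:8, 3:8),
  stringsAsFactors = FALSE
)

# Expected ENCOUNTER column, copied from the question's desired output.
expected <- c(1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2)

# Base R check: number visits per ID by order of first appearance.
enc <- ave(df1$VISIT_ID, df1$ID, FUN = function(x) match(x, unique(x)))
print(all(enc == expected))  # TRUE
```

The same `expected` vector can be compared against the ENCOUNTER column produced by any of the variants in this thread.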
Another option is duplicated:
df1 %>%
group_by(ID) %>%
mutate(ENCOUNTER = cumsum(!duplicated(VISIT_ID)))
Or with data.table:
library(data.table)
setDT(df1)[, ENCOUNTER := match(VISIT_ID, unique(VISIT_ID)), by = ID]
Or in base R:
df1$ENCOUNTER <- with(df1, ave(VISIT_ID, ID, FUN = function(x) cumsum(!duplicated(x))))
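The match and duplicated variants give the same numbers on the question's data because each VISIT_ID occupies one contiguous block of rows. They can differ if a visit reappears after another one. A small sketch on a hypothetical vector (the values "A", "B", "C" are invented for illustration):

```r
# Hypothetical visit IDs for one patient; "A" reappears after "B".
visits <- c("A", "A", "B", "A", "C")

# match(): position of each value among the unique values, so a
# reappearing visit keeps the number it received the first time.
by_match <- match(visits, unique(visits))

# cumsum(!duplicated()): the counter increases at each first occurrence
# and is carried forward, so a reappearing visit takes the current count.
by_cumsum <- cumsum(!duplicated(visits))

print(by_match)   # 1 1 2 1 3
print(by_cumsum)  # 1 1 2 2 3
```

If the rows for a visit might not be adjacent, match is probably the safer choice, since it ties the number to the visit rather than to the row position.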
Answer 1 (score: 1)
Using base R's ave, we can convert VISIT_ID to a factor and then to numeric to get a unique number for each VISIT_ID within each ID:
df$ENCOUNTER <- ave(df$VISIT_ID, df$ID, FUN = function(x) as.numeric(as.factor(x)))
df
# ID VISIT_ID DATE DV ENCOUNTER
#1 1001 112233 12-23 3 1
#2 1001 112233 12-23 4 1
#3 1001 112244 12-23 5 2
#4 1001 112244 12-23 6 2
#5 1001 112244 12-23 7 2
#6 1001 112244 12-23 8 2
#7 1002 112254 12-23 3 1
#8 1002 112254 12-23 4 1
#9 1002 112254 12-23 5 1
#10 1002 112264 12-23 6 2
#11 1002 112264 12-23 7 2
#12 1002 112264 12-23 8 2
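One caveat with the factor approach: as.factor sorts its levels, so the numbers follow the sorted order of VISIT_ID rather than the order of appearance. The question's data happens to be ascending within each ID, so the result matches. A sketch on a hypothetical unsorted vector shows the difference, along with one way to keep first-appearance order:

```r
# Hypothetical visit IDs that are not in ascending order.
visits <- c(300L, 300L, 100L, 200L)

# as.factor() sorts its levels, so the codes follow sorted order.
sorted_codes <- as.numeric(as.factor(visits))

# Passing levels = unique(x) keeps the order of first appearance.
appearance_codes <- as.numeric(factor(visits, levels = unique(visits)))

print(sorted_codes)      # 3 3 1 2
print(appearance_codes)  # 1 1 2 3
```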