My code:
re$p_RID <- ifelse((re$group_UID & re$Amount_type=='Draw'),
shift(re$id), 'NA')
After running the code, my data frame looks like this:
id user_id Amount_type group_UID p_RID
30 11 Non 1 NA
31 11 Draw 1 30
54 5 Non 2 NA
322 5 Draw 2 54
21 5 Draw 2 322
13 5 Non 2 NA
2445 5 Draw 2 13
111 44 Non 3 NA
287 44 Draw 3 111
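I want p_RID to hold the id of the first occurrence of the "Non" Amount_type. (Within the same group_UID, if "Non" occurs more than once, each occurrence should be treated as a new first occurrence.) The result should look like this: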
id user_id Amount_type group_UID p_RID
30 11 Non 1 NA
31 11 Draw 1 30
54 5 Non 2 NA
322 5 Draw 2 54
21 5 Draw 2 54 <- this is where I don't know how to edit
13 5 Non 2 NA
2445 5 Draw 2 13
111 44 Non 3 NA
287 44 Draw 3 111
Answer 0 (score: 0)
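One way using dplyr is to group_by group_UID and a running count of "Non" values, then within each group assign NA to the first row and the group's first id to every other row.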
library(dplyr)
df %>%
group_by(group_UID, group = cumsum(Amount_type == "Non")) %>%
mutate(p_RID = ifelse(row_number() == 1, NA, id[1L])) %>%
ungroup() %>%
select(-group)
# id user_id Amount_type group_UID p_RID
# <int> <int> <fct> <int> <int>
#1 30 11 Non 1 NA
#2 31 11 Draw 1 30
#3 54 5 Non 2 NA
#4 322 5 Draw 2 54
#5 21 5 Draw 2 54
#6 13 5 Non 2 NA
#7 2445 5 Draw 2 13
#8 111 44 Non 3 NA
#9 287 44 Draw 3 111
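To see why this grouping works, it may help to look at the helper key on its own. The following is a minimal sketch, assuming the data frame is named df as in the answer and is rebuilt here from the question's sample rows; run_key is a hypothetical name used only for illustration.

```r
# Sample data rebuilt from the question (assumed to be named df)
df <- data.frame(
  id = c(30L, 31L, 54L, 322L, 21L, 13L, 2445L, 111L, 287L),
  user_id = c(11L, 11L, 5L, 5L, 5L, 5L, 5L, 44L, 44L),
  Amount_type = c("Non", "Draw", "Non", "Draw", "Draw",
                  "Non", "Draw", "Non", "Draw"),
  group_UID = c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L)
)

# Every "Non" row increments the counter, so each "Non" starts a new run
# and the "Draw" rows that follow it share the same key
run_key <- cumsum(df$Amount_type == "Non")
run_key
# [1] 1 1 2 2 2 3 3 4 4
```

Rows 3 to 5 share key 2 and rows 6 to 7 share key 3, which is why the two "Non" rows inside group_UID 2 are each treated as a new first occurrence.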
Another way is:
df %>%
group_by(group_UID, group = cumsum(Amount_type == "Non")) %>%
mutate(p_RID = ifelse(Amount_type == "Non", NA, first(id))) %>%
ungroup() %>%
select(-group)
We can also use base R's ave here:
with(df, ave(id, group_UID, cumsum(Amount_type == "Non"), FUN = function(x)
ifelse(seq_along(x) == 1, NA, x[1L])))
#[1] NA 30 NA 54 54 NA 13 NA 111
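The ave call returns a plain vector rather than a modified data frame, so it presumably needs to be assigned back to the p_RID column. A minimal sketch, assuming df is rebuilt from the question's sample rows:

```r
# Sample data rebuilt from the question (assumed to be named df)
df <- data.frame(
  id = c(30L, 31L, 54L, 322L, 21L, 13L, 2445L, 111L, 287L),
  user_id = c(11L, 11L, 5L, 5L, 5L, 5L, 5L, 44L, 44L),
  Amount_type = c("Non", "Draw", "Non", "Draw", "Draw",
                  "Non", "Draw", "Non", "Draw"),
  group_UID = c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L)
)

# Store the per-run result in a new column: NA for the first row of each
# run (the "Non" row), and that row's id for the rows that follow it
df$p_RID <- with(df, ave(id, group_UID, cumsum(Amount_type == "Non"),
                         FUN = function(x)
                           ifelse(seq_along(x) == 1, NA, x[1L])))
df$p_RID
# [1]  NA  30  NA  54  54  NA  13  NA 111
```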