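I want to number runs of consecutive identical values, the way data.table::rleid does. The catch is that some rows should be excluded from the numbering, and which rows to exclude is determined by another column. I found that data.table::rleid can be applied twice, but that still does not give the desired result; see below: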
my_example <- structure(list(event = c(234, 234, 224, 232, 232, 201, 201, 201,
201, 201, 201, 201, 244, 244, 201, 201, 201, 244, 244, 212, 201,
201, 201, 249, 201, 201, 201, 201, 201, 201, 201, 249, 201, 201,
244, 244, 201, 261, 245, 203, 204, 204, 201, 201, 201, 201, 201,
201, 216, 201), subgroup = c(10L, 11L, 10L, 10L, 11L, 10L, 10L,
10L, 10L, 10L, 10L, 11L, 11L, 10L, 10L, 10L, 10L, 10L, 11L, 11L,
10L, 11L, 11L, 11L, 11L, 11L, 11L, 10L, 11L, 11L, 11L, 10L, 10L,
10L, 10L, 11L, 11L, 10L, 11L, 10L, 10L, 11L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 11L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -50L), .Names = c("event", "subgroup"))
library(dplyr)  # provides %>% and mutate()
my_example %>%
  mutate(in_seq = ! event %in% c(224, 232, 234, 261),
         seq = data.table::rleid(subgroup) * in_seq,
         seq2 = data.table::rleid(seq))
# A tibble: 50 x 5
event subgroup in_seq seq seq2
<dbl> <int> <lgl> <int> <int>
1 234 10 F 0 1
2 234 11 F 0 1
3 224 10 F 0 1
4 232 10 F 0 1
5 232 11 F 0 1
6 201 10 T 5 2
7 201 10 T 5 2
8 201 10 T 5 2
9 201 10 T 5 2
10 201 10 T 5 2
# ... with 40 more rows
How can I exclude some rows from the count? (In the example above, that would mean rows 1:5 and row 38 have NA in seq2.)
Answer 0 (score: 1)
If we want to change the values in 's2' to NA for the excluded rows:
library(data.table)
my_example %>%
mutate(in_seq = ! event %in% c(224, 232, 234, 261),
s1 = rleid(subgroup * in_seq),
s2 = rleid(s1) * NA ^ !in_seq)
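The `NA ^ !in_seq` term above may look cryptic, so here is a minimal base R sketch of what it does. The vectors `in_seq` and `ids` below are made-up toy values, not taken from my_example. It relies on the fact that in R anything raised to the power 0 is 1, including NA, while NA ^ 1 stays NA.

```r
# Toy values, assumed for illustration only.
in_seq <- c(FALSE, TRUE, TRUE, FALSE)
ids    <- c(1L, 2L, 2L, 3L)

# !in_seq is TRUE (1) for excluded rows and FALSE (0) for kept rows,
# so NA ^ !in_seq is NA for excluded rows and 1 for kept rows.
mask <- NA ^ !in_seq

# Multiplying by the mask blanks out the excluded rows and leaves the rest.
masked <- ids * mask
print(masked)
```

The same effect could be written more explicitly with `ifelse(in_seq, ids, NA)`; the exponent form is just a compact idiom.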
Or, if 's2' needs to start from 1, skipping the FALSE values in 'in_seq':
my_example %>%
mutate(in_seq = ! event %in% c(224, 232, 234, 261),
s1 = data.table::rleid(subgroup) * in_seq,
s2 = (NA^!s1) * s1,
s2 = match(s2, unique(na.omit(s2))))
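The renumbering step in the last line can be seen in isolation with a small base R sketch. The input vector here is invented for illustration: `unique(na.omit(s2))` lists the surviving IDs in order of first appearance, and `match()` maps each value to its position in that list, so the IDs become 1, 2, 3, ... while NA stays NA.

```r
# Toy input, assumed for illustration: NA marks excluded rows,
# the remaining values are run IDs with gaps.
s2 <- c(NA, NA, 5, 5, 7, NA, 9, 9)

# Distinct non-NA IDs in order of first appearance: 5, 7, 9
ids_in_order <- unique(na.omit(s2))

# Position of each value in that list; NA has no match and stays NA.
renumbered <- match(s2, ids_in_order)
print(renumbered)
```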
Or perhaps:
setDT(my_example)[, in_seq := !event %in% c(224, 232, 234, 261)
][, s1 := rleid(subgroup) * in_seq
][s1 != 0, s2 := rleid(s1)]
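To see what computing the run IDs only on the kept rows does, here is a rough base R sketch on a tiny made-up data frame. `run_id()` is a hypothetical stand-in for data.table::rleid on a single vector (it assumes a non-empty vector without NA), and `toy` is not part of my_example. This is only an illustration of the idea, not a replacement for the data.table code.

```r
# Hypothetical helper: a new ID starts wherever the value differs from the
# previous one. Assumes x is non-empty and contains no NA.
run_id <- function(x) cumsum(c(TRUE, x[-1] != x[-length(x)]))

# Toy data, assumed for illustration only.
toy <- data.frame(
  event    = c(234, 201, 201, 201, 261, 201, 244),
  subgroup = c(10L, 10L, 10L, 11L, 11L, 11L, 10L)
)

toy$in_seq <- !toy$event %in% c(224, 232, 234, 261)
toy$s1 <- run_id(toy$subgroup) * toy$in_seq   # 0 marks excluded rows

# Number the runs using the kept rows only; excluded rows stay NA.
toy$s2 <- NA_integer_
keep <- toy$s1 != 0
toy$s2[keep] <- run_id(toy$s1[keep])
print(toy)
```

Note that in this sketch an excluded row sitting inside a run (row 5 here) does not split the run: the rows on either side share one ID, because the excluded row is dropped before the runs are counted.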