I am trying to get the output shown below, but I don't know how to do it in R. This is the data I have:
ID PERIOD rating
8 0 3
8 1 3
8 2 2
8 3 F
8 4 3
8 5 F
8 6 1
9 0 2
9 1 2
9 2 1
And this is the output I want:
ID PERIOD rating
8 0 3
8 1 3
8 2 2
8 3 F
8 4 F
8 5 F
8 6 F
9 0 2
9 1 2
9 2 1
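As you can see, once the rating for a given ID hits "F", it should stay "F" for all later periods of that ID. I don't know how to code this. Any help would be much appreciated.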
Answer 0 (score: 4)
Using data.table:
library(data.table)
setDT(data)
data[, rating := ifelse(cumsum(rating == "F") >= 1, "F", rating), by = ID]
data
ID PERIOD rating
1: 8 0 3
2: 8 1 3
3: 8 2 2
4: 8 3 F
5: 8 4 F
6: 8 5 F
7: 8 6 F
8: 9 0 2
9: 9 1 2
10: 9 2 1
where data is:
data <- data.frame(
ID = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L),
PERIOD = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 0L, 1L, 2L),
rating = c("3", "3", "2", "F", "3", "F", "1", "2", "2", "1"),
stringsAsFactors = FALSE
)
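To see why the cumsum() condition works, here is a minimal base R sketch on the ratings of a single ID (ID 8 from the question). The variable names is_f, seen_f and sticky are only illustrative, and the sketch assumes the ratings are already in PERIOD order.

```r
# Ratings for one ID, in PERIOD order (ID 8 from the question)
rating <- c("3", "3", "2", "F", "3", "F", "1")

# TRUE wherever the rating is "F"
is_f <- rating == "F"

# Running count of "F" so far; it is >= 1 from the first "F" onwards
seen_f <- cumsum(is_f) >= 1

# Keep the original rating until the first "F", then force "F"
sticky <- ifelse(seen_f, "F", rating)
print(sticky)
```

The data.table call above does the same thing once per ID via by = ID.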
Edit
This can be written more concisely, because ifelse() coerces a nonzero count to TRUE:
data[, rating := ifelse(cumsum(rating == "F"), "F", rating), by = ID]
Edit 2
As Ronak suggested, you can also do this with ave(), which ships with base R:
data$rating <-
  ifelse(ave(data$rating == "F", data$ID, FUN = cumsum), "F", data$rating)
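For anyone who wants to try the base R route without data.table, here is a self-contained sketch. It assumes the "F" should only carry forward in PERIOD order, so it sorts by ID and PERIOD first; that sorting step and the name seen_f are my additions, not part of the answer above.

```r
data <- data.frame(
  ID = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L),
  PERIOD = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 0L, 1L, 2L),
  rating = c("3", "3", "2", "F", "3", "F", "1", "2", "2", "1"),
  stringsAsFactors = FALSE
)

# Assumption: rows must be in PERIOD order within each ID
data <- data[order(data$ID, data$PERIOD), ]

# Per-ID running count of "F"; greater than 0 once an "F" has occurred
seen_f <- ave(data$rating == "F", data$ID, FUN = cumsum) > 0

# Overwrite every rating at or after the first "F" of its ID
data$rating[seen_f] <- "F"
print(data)
```

Assigning through a logical index avoids ifelse() altogether, which is a matter of taste rather than a correction.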
Answer 1 (score: 1)
Here is one option:
library(dplyr)
# For an ID with no "F", which(...)[1] is NA, so nothing is replaced
data %>%
  group_by(ID) %>%
  mutate(rating = replace(rating, row_number() > which(rating == "F")[1], "F"))
# A tibble: 10 x 3
# Groups: ID [2]
# ID PERIOD rating
# <int> <int> <chr>
# 1 8 0 3
# 2 8 1 3
# 3 8 2 2
# 4 8 3 F
# 5 8 4 F
# 6 8 5 F
# 7 8 6 F
# 8 9 0 2
# 9 9 1 2
#10 9 2 1
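A possible variant of the same idea, offered as a sketch rather than a replacement: dplyr also provides cumany(), which is TRUE from the first TRUE onwards, so the NA index for IDs without an "F" never comes up. The name result is only illustrative, and the rows are assumed to be in PERIOD order within each ID.

```r
library(dplyr)

data <- data.frame(
  ID = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L),
  PERIOD = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 0L, 1L, 2L),
  rating = c("3", "3", "2", "F", "3", "F", "1", "2", "2", "1"),
  stringsAsFactors = FALSE
)

result <- data %>%
  group_by(ID) %>%
  # cumany() is TRUE at the first "F" and at every later row of that ID
  mutate(rating = if_else(cumany(rating == "F"), "F", rating)) %>%
  ungroup()

print(result)
```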