I have a problem. The column I am trying to manipulate looks like this:
> DT <- data.table(Group= c("SM", NA, NA, NA, NA, NA, "GH", NA, NA, NA, NA, NA, NA, NA))
> DT
Group
1: SM
2: <NA>
3: <NA>
4: <NA>
5: <NA>
6: <NA>
7: GH
8: <NA>
9: <NA>
10: <NA>
11: <NA>
12: <NA>
13: <NA>
14: <NA>
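I want to fill the NAs with the previous value, but only for a specific number of rows. In this case only 4 rows should be filled after each value, so the desired result is: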
Group
1: SM
2: SM
3: SM
4: SM
5: SM
6: <NA>
7: GH
8: GH
9: GH
10: GH
11: GH
12: <NA>
13: <NA>
14: <NA>
How can I achieve this? I tried na.locf(), but it does not do what I want. Thanks in advance.
Answer 0 (score: 3)
Here is a solution using the dplyr package.
library(dplyr)
library(data.table)
# Set the threshold
threshold <- 4
DT2 <- DT %>%
  mutate(Group_ID = cumsum(!is.na(Group))) %>%
  group_by(Group_ID) %>%
  mutate(ID = row_number() - 1) %>%
  mutate(Group = ifelse(ID <= threshold, first(Group), NA_character_)) %>%
  ungroup() %>%
  select(Group)
DT2
# # A tibble: 14 x 1
# Group
# <chr>
# 1 SM
# 2 SM
# 3 SM
# 4 SM
# 5 SM
# 6 NA
# 7 GH
# 8 GH
# 9 GH
# 10 GH
# 11 GH
# 12 NA
# 13 NA
# 14 NA
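To see why this works, it can help to look at the two helper columns on their own. The following is a minimal base R sketch, not part of the pipeline above: it rebuilds Group_ID and ID for the question's data using ave() in place of group_by() and row_number(), which is my substitution for illustration only.

```r
# Same values as the Group column in the question, as a plain vector.
Group <- c("SM", NA, NA, NA, NA, NA, "GH", NA, NA, NA, NA, NA, NA, NA)

# Every non-NA value starts a new run, so the running count of
# non-NA values labels the run that each row belongs to.
Group_ID <- cumsum(!is.na(Group))
Group_ID
# [1] 1 1 1 1 1 1 2 2 2 2 2 2 2 2

# Zero-based position of each row inside its run
# (the equivalent of row_number() - 1 per group).
ID <- ave(seq_along(Group), Group_ID, FUN = seq_along) - 1
ID
# [1] 0 1 2 3 4 5 0 1 2 3 4 5 6 7
```

Rows with ID 0 hold the original value, rows with ID 1 to 4 are the ones to fill, and anything with a larger ID stays NA, which is what the ID <= threshold test expresses.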
Answer 1 (score: 3)
An option with data.table is
library(data.table)
DT[, Group := Group[1][NA^(seq_len(.N) > 5)], cumsum(!is.na(Group))]
DT
# Group
# 1: SM
# 2: SM
# 3: SM
# 4: SM
# 5: SM
# 6: <NA>
# 7: GH
# 8: GH
# 9: GH
#10: GH
#11: GH
#12: <NA>
#13: <NA>
#14: <NA>
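The NA^(...) part is compact, so here is a small base R sketch of what it evaluates to inside one group. The run length n and the name limit are hypothetical values chosen for illustration; the 5 in the one-liner corresponds to the first row plus the 4 rows to fill.

```r
n <- 8          # hypothetical run length (stands in for .N)
limit <- 1 + 4  # the original value plus 4 filled rows

# In R, NA^0 is 1 and NA^1 is NA. The logical vector is coerced to 0/1,
# so the result is 1 for rows within the limit and NA beyond it.
idx <- NA^(seq_len(n) > limit)
idx
# [1]  1  1  1  1  1 NA NA NA

# Indexing a length-one vector with 1 repeats its value;
# indexing with NA yields NA.
filled <- "GH"[idx]
filled
```

So Group[1][idx] repeats the first value of the run for the first five rows and returns NA for the rest.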
Answer 2 (score: 2)
Here is one way to do it:
> DT[, Group := ifelse(seq_len(.N) <= 1 + 4, Group[1], Group), by = cumsum(!is.na(Group))]
> DT
Group
1: SM
2: SM
3: SM
4: SM
5: SM
6: <NA>
7: GH
8: GH
9: GH
10: GH
11: GH
12: <NA>
13: <NA>
14: <NA>
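If the limit needs to vary, the same idea can be wrapped in a small function. This is only a sketch in base R under the assumption that the input is a non-empty atomic vector; fill_limited and its limit argument are hypothetical names, not part of data.table or any other package.

```r
# Hypothetical helper: carry each non-NA value forward over at most
# `limit` following NA entries. Assumes x is a non-empty atomic vector.
fill_limited <- function(x, limit) {
  run <- cumsum(!is.na(x))                        # run label (0 for leading NAs)
  pos <- ave(seq_along(x), run, FUN = seq_along)  # 1-based position within run
  first_idx <- ave(seq_along(x), run, FUN = function(i) i[1])  # index of run start
  out <- x[first_idx]           # repeat the first value of each run
  out[pos > limit + 1] <- NA    # keep the value itself plus `limit` fills
  out
}

Group <- c("SM", NA, NA, NA, NA, NA, "GH", NA, NA, NA, NA, NA, NA, NA)
result <- fill_limited(Group, 4)
result
```

With a data.table it could then be applied as DT[, Group := fill_limited(Group, 4)].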