I have the following data frame:
library(dplyr)
# tibble() replaces the deprecated data_frame(); the data are unchanged
dat <- tibble(id = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
                     3L, 5L, 5L, 7L, 7L, 7L, 8L, 8L, 8L, 10L),
              wish1 = c(4L, NA, NA, 1L, NA, 1L, NA, NA, NA,
                        NA, -1L, 8L, NA, 1L, -1L, NA, 4L,
                        NA, NA, -1L),
              wish2 = c(1L, NA, NA, 1L, NA, 1L, NA, NA, NA,
                        NA, -1L, 1L, NA, 2L, -1L, NA, 2L, NA, NA, 1L),
              participate = c(NA, 1L, NA, NA, 1L, NA, NA, 1L, NA, NA, NA,
                              NA, 1L, NA, 4L, NA, NA, NA, 1L, NA))
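Within each group, I would like to replace the NA values of the variable participate with the value that is available in the same group. If a group has no value, the NA can remain.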
I need something like:
df <- data %>% group_by(id) %>%
  mutate(participate = (participate, na.rm = TRUE))
Unfortunately, this does not work without a function, of whatever kind, in front of the parentheses.
Answer 0 (score: 2)
There may be more concise or elegant ways, but I would like to share some ideas.
library(tidyr)
# arrange() sorts NA last within each id, so fill() can then
# fill each NA from the previous (non-missing) entry
dat2 <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  fill(participate)
# dat_temp is a summary data frame showing the fill value for each id
dat_temp <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  slice(1) %>%
  select(id, participate)
# Join dat_temp back to dat (this overwrites dat2 and keeps the original row order)
dat2 <- dat %>%
  left_join(dat_temp, by = "id") %>%
  select(-participate.x) %>%
  rename(participate = participate.y)
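As a hedged aside on the fill() idea above: fill() also accepts a .direction argument, and "downup" fills downward first and then upward, so the arrange() step (and the row reordering it causes) can be skipped. The sketch below uses a small hypothetical data frame named toy, not the asker's data, and assumes a tidyr version that supports .direction = "downup" (1.0.0 or later).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: at most one non-missing value per id,
# which matches the structure of the asker's data
toy <- tibble(
  id = c(1L, 1L, 1L, 2L, 2L, 3L),
  participate = c(NA, 1L, NA, 4L, NA, NA)
)

filled <- toy %>%
  group_by(id) %>%
  # fill downward, then upward, within each id; rows keep their order
  fill(participate, .direction = "downup") %>%
  ungroup()

print(filled)
```

Groups without any value (id 3 here) stay NA. If a group held several different non-missing values, fill() would keep each of them and only fill the gaps, which differs from taking one value per group.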
This solution is based on alistaire's comment:
dat2 <- dat %>%
  arrange(id, participate) %>%
  group_by(id) %>%
  mutate(participate = first(participate))
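The reason this works appears to be that arrange() always sorts missing values to the end, so within each id the first row holds a non-missing value whenever one exists. A minimal sketch on a hypothetical toy data frame (the names toy, sorted and result are illustrative only) makes the intermediate step visible:

```r
library(dplyr)

# Hypothetical toy data: id 7 has one value, id 10 has none
toy <- tibble(
  id = c(7L, 7L, 7L, 10L),
  participate = c(NA, 4L, NA, NA)
)

# arrange() places NA last within each id
sorted <- toy %>% arrange(id, participate)

result <- sorted %>%
  group_by(id) %>%
  # every row in the group receives the group's first value
  mutate(participate = first(participate)) %>%
  ungroup()

print(sorted)
print(result)
```

Note that the rows come back in sorted order rather than in their original order, and a group with only NA (id 10 here) keeps NA.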