In a data.frame, I want to "fill" NA values with a previous AGE value, but only when a condition is met.
>x <- data.frame(
ID = c(1,1,1,1,2,2,2,2,3,3,4,4,4,4),
YEAR = c(2016,2017,2018,2019,2016,2017,2018,2019,2016,2018,2016,2017,2018,2019),
AGE = c("ADULT", NA, NA, NA, "ADULT", NA, "ADULT", NA, "JUVENILE", NA, "JUVENILE", "ADULT", NA, NA)
)
>x
ID YEAR AGE
1 1 2016 ADULT
2 1 2017 <NA>
3 1 2018 <NA>
4 1 2019 <NA>
5 2 2016 ADULT
6 2 2017 <NA>
7 2 2018 ADULT
8 2 2019 <NA>
9 3 2016 JUVENILE
10 3 2018 <NA>
11 4 2016 JUVENILE
12 4 2017 ADULT
13 4 2018 <NA>
14 4 2019 <NA>
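If the age is ADULT, I want to carry it forward to fill the following years. However, if the age at the first occurrence of an ID is JUVENILE, I want to fill the following years with ADULT instead.
I tried a few approaches, but could not find a solution that conditions on the first occurrence.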
library(dplyr); library(tidyr)  # %>% and group_by() come from dplyr, fill() from tidyr
x.age.ok <- x %>% group_by(ID) %>% fill(AGE, .direction = "down")
I got:
>x.age.ok
ID YEAR AGE
1 1 2016 ADULT
2 1 2017 ADULT
3 1 2018 ADULT
4 1 2019 ADULT
5 2 2016 ADULT
6 2 2017 ADULT
7 2 2018 ADULT
8 2 2019 ADULT
9 3 2016 JUVENILE
10 3 2018 JUVENILE
11 4 2016 JUVENILE
12 4 2017 ADULT
13 4 2018 ADULT
14 4 2019 ADULT
But I want this instead (the difference is highlighted with **):
>x.age.ok
ID YEAR AGE
1 1 2016 ADULT
2 1 2017 ADULT
3 1 2018 ADULT
4 1 2019 ADULT
5 2 2016 ADULT
6 2 2017 ADULT
7 2 2018 ADULT
8 2 2019 ADULT
9 3 2016 JUVENILE
10 3 2018 **ADULT**
11 4 2016 JUVENILE
12 4 2017 ADULT
13 4 2018 ADULT
14 4 2019 ADULT
Any ideas? Can we put an if inside mutate?
Answer 0 (score: 0)
Maybe you can try:
library(dplyr)
x %>%
arrange(ID, YEAR) %>%
group_by(ID) %>%
mutate(AGE = if(first(AGE) == "JUVENILE") replace(AGE, is.na(AGE), "ADULT")
else replace(AGE, is.na(AGE), first(AGE)))
# ID YEAR AGE
# <dbl> <dbl> <fct>
# 1 1 2016 ADULT
# 2 1 2017 ADULT
# 3 1 2018 ADULT
# 4 1 2019 ADULT
# 5 2 2016 ADULT
# 6 2 2017 ADULT
# 7 2 2018 ADULT
# 8 2 2019 ADULT
# 9 3 2016 JUVENILE
#10 3 2018 ADULT
#11 4 2016 JUVENILE
#12 4 2017 ADULT
#13 4 2018 ADULT
#14 4 2019 ADULT
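If the first value of AGE within an ID is "JUVENILE", we replace all of that ID's NA values with "ADULT"; otherwise we replace them with the first value of AGE. Because the data are grouped by ID, the if condition is evaluated once per group on a single value, which is why a plain if works inside mutate here.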