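How can I produce the imputed variable shown below? For example, the missing 2001 value for id 1 is filled in with the mean of its 2000 and 2002 values.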
id Year A imputed
1 2000 6 6
1 2001 NA 7
1 2002 8 8
1 2003 10 10
2 2000 2 2
2 2001 NA 5
2 2002 8 8
2 2003 5 5
3 2000 9 9
3 2001 10 10
3 2002 NA 10.5
3 2003 11 11
Answer 0 (score: 1)
Hope this helps!
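library(dplyr)

# Sort by id and Year, then replace each missing A with the mean of the
# previous and next rows; non-missing values are copied unchanged
df %>%
  arrange(id, Year) %>%
  mutate(Imputed = ifelse(is.na(A), (lag(A) + lead(A)) / 2, A))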
Output:
id Year A Imputed
1 1 2000 6 6.0
2 1 2001 NA 7.0
3 1 2002 8 8.0
4 1 2003 10 10.0
5 2 2000 2 2.0
6 2 2001 NA 5.0
7 2 2002 8 8.0
8 2 2003 5 5.0
9 3 2000 9 9.0
10 3 2001 10 10.0
11 3 2002 NA 10.5
12 3 2003 11 11.0
#sample data
> dput(df)
structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L), Year = c(2000L, 2001L, 2002L, 2003L, 2000L, 2001L, 2002L,
2003L, 2000L, 2001L, 2002L, 2003L), A = c(6L, NA, 8L, 10L, 2L,
NA, 8L, 5L, 9L, 10L, NA, 11L)), .Names = c("id", "Year", "A"), class = "data.frame", row.names = c(NA,
-12L))
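
The answer's code takes `lag()` and `lead()` over the whole sorted data frame, so a missing value in the first or last year of an id would be averaged with a value from a neighbouring id. That does not happen with this sample data, but a grouped variant avoids it. The following is a minimal sketch, not part of the original answer; the result name `imputed` and the rebuilt `df` are assumptions made so the example runs on its own.

```r
library(dplyr)

# Sample data rebuilt from the question (same values as the dput() output)
df <- data.frame(
  id   = rep(1:3, each = 4),
  Year = rep(2000:2003, times = 3),
  A    = c(6, NA, 8, 10, 2, NA, 8, 5, 9, 10, NA, 11)
)

# Grouping by id keeps lag()/lead() from reaching into a neighbouring id
imputed <- df %>%
  arrange(id, Year) %>%
  group_by(id) %>%
  mutate(Imputed = ifelse(is.na(A), (lag(A) + lead(A)) / 2, A)) %>%
  ungroup() %>%
  as.data.frame()

print(imputed)
```

With grouping, a missing value at the start or end of an id stays NA, because one of its neighbours does not exist within the group. Two consecutive missing values also stay NA, since each has an NA neighbour.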