How to fill NA values in R panel data with the mean of the previous and following year?

Time: 2017-12-20 04:20:34

Tags: r panel

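How can I produce the "imputed" column shown below? Each NA in A should be replaced by the mean of the previous and following year within the same id. For example, the 2001 value for id 1 is filled with the mean of its 2000 and 2002 values, (6 + 8) / 2 = 7.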

id             Year     A      imputed
1              2000     6       6
1              2001     NA      7
1              2002     8       8
1              2003     10      10
2              2000     2       2
2              2001     NA      5
2              2002     8       8
2              2003     5       5
3              2000     9       9 
3              2001     10      10
3              2002     NA      10.5
3              2003     11      11

1 answer:

Answer 0 (score: 1)

Hope this helps!

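library(dplyr)

df %>%
  arrange(id, Year) %>%
  group_by(id) %>%   # keep lag()/lead() from reaching into a neighbouring id
  mutate(Imputed = ifelse(is.na(A), (lag(A) + lead(A)) / 2, A)) %>%
  ungroup() %>%
  as.data.frame()    # print as a plain data.frame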

The output is:

   id Year  A Imputed
1   1 2000  6     6.0
2   1 2001 NA     7.0
3   1 2002  8     8.0
4   1 2003 10    10.0
5   2 2000  2     2.0
6   2 2001 NA     5.0
7   2 2002  8     8.0
8   2 2003  5     5.0
9   3 2000  9     9.0
10  3 2001 10    10.0
11  3 2002 NA    10.5
12  3 2003 11    11.0


#sample data
> dput(df)
structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L), Year = c(2000L, 2001L, 2002L, 2003L, 2000L, 2001L, 2002L, 
2003L, 2000L, 2001L, 2002L, 2003L), A = c(6L, NA, 8L, 10L, 2L, 
NA, 8L, 5L, 9L, 10L, NA, 11L)), .Names = c("id", "Year", "A"), class = "data.frame", row.names = c(NA, 
-12L))
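
If you would rather not depend on dplyr, the same idea can be sketched in base R. This is only a minimal sketch under a few assumptions: the data frame is rebuilt by hand with the same values as the sample data above, `impute_neighbour_mean` is a hypothetical helper name (not from any package), and an NA in the first or last year of an id is assumed to stay NA because it has only one neighbour.

```r
# Same values as the sample data above, rebuilt by hand
df <- data.frame(
  id   = rep(1:3, each = 4),
  Year = rep(2000:2003, times = 3),
  A    = c(6, NA, 8, 10, 2, NA, 8, 5, 9, 10, NA, 11)
)

# Hypothetical helper: replace each NA with the mean of its two neighbours.
# An NA in the first or last position has only one neighbour and stays NA.
impute_neighbour_mean <- function(x) {
  n <- length(x)
  prev_val <- c(NA, x[-n])   # value one step earlier (like dplyr::lag)
  next_val <- c(x[-1], NA)   # value one step later (like dplyr::lead)
  ifelse(is.na(x), (prev_val + next_val) / 2, x)
}

# Sort by id and Year, then apply the helper separately within each id
df <- df[order(df$id, df$Year), ]
df$Imputed <- ave(df$A, df$id, FUN = impute_neighbour_mean)

df$Imputed
# values: 6, 7, 8, 10, 2, 5, 8, 5, 9, 10, 10.5, 11
```

Because `ave()` applies the helper per id, a missing first or last year never borrows a value from another id. Two consecutive NAs would also stay NA with this sketch, since one of the neighbours is itself missing.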