Filling missing values in R by following a trend

Time: 2018-09-19 11:18:03

Tags: r function trend

I have a reference dataset:

lookup = structure(list(v = c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26), TI = c(0.913066666666667, 
0.70784, 0.584704, 0.502613333333333, 0.443977142857143, 0.4, 
0.365795555555556, 0.338432, 0.316043636363636, 0.297386666666667, 
0.2816, 0.268068571428571, 0.256341333333333, 0.24608, 0.237025882352941, 
0.228977777777778, 0.221776842105263, 0.215296, 0.209432380952381, 
0.204101818181818, 0.199234782608696, 0.194773333333333, 0.1906688, 
0.18688)), class = "data.frame", row.names = c(NA, -24L))

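I divide each element of column TI by the previous element (so trend[1] is NA, because the first element has no predecessor):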

library(dplyr)
trend = lookup$TI/lag(lookup$TI)

Using this trend as a reference, I want to fill in the NA values in a test file:

test = structure(list(events = c(5, 179, 256, 192, 117, 35, 35, 11, 
15, 3, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0), v = c(3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 
22, 23, 24, 25, 26), TI = c(NA, 0.0795651909763371, 0.0587914615737312, 
0.0640542134644949, 0.0621684208232864, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 
-24L), class = "data.frame")

In the test file, I want to replace every NA in test$TI according to the trend. That means:

test$TI[6] = test$TI[5]*trend[6]
test$TI[7] = test$TI[6]*trend[7]
...

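For test$TI[1], which comes before the first known value, the trend vector has to be built the other way round (previous element divided by the current one):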

library(dplyr)
trend = lag(lookup$TI)/lookup$TI

Then:

test$TI[1] = test$TI[2]*trend[2]

My question is: how can I do this automatically? I have many test files, and the positions of the NAs are not always the same.
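To make the goal concrete, here is a rough base-R sketch of the kind of automation I mean. It is not a tested solution. The function name fill_by_trend and the toy data frames lookup_demo and test_demo are hypothetical, made up for illustration. It assumes that both data frames share the key column v, that every v in the test file exists in the lookup, and that at least one TI value is known.

```r
# Hypothetical helper: fill NA values in test$TI by following the
# ratio between neighbouring lookup$TI values.
fill_by_trend <- function(test, lookup) {
  # Align the reference TI values to the test rows via the shared key v
  ref <- lookup$TI[match(test$v, lookup$v)]
  ti <- test$TI
  n <- length(ti)

  # Without any known value there is nothing to anchor the trend on
  if (all(is.na(ti))) return(test)

  # Forward pass: TI[i] = TI[i - 1] * ref[i] / ref[i - 1]
  for (i in seq_len(n)[-1]) {
    if (is.na(ti[i]) && !is.na(ti[i - 1])) {
      ti[i] <- ti[i - 1] * ref[i] / ref[i - 1]
    }
  }

  # Backward pass for leading NAs: TI[i] = TI[i + 1] * ref[i] / ref[i + 1]
  for (i in rev(seq_len(n - 1))) {
    if (is.na(ti[i]) && !is.na(ti[i + 1])) {
      ti[i] <- ti[i + 1] * ref[i] / ref[i + 1]
    }
  }

  test$TI <- ti
  test
}

# Toy data (made up): the reference halves at every step
lookup_demo <- data.frame(v = 1:4, TI = c(8, 4, 2, 1))
test_demo <- data.frame(v = 1:4, TI = c(NA, 10, NA, NA))

filled_demo <- fill_by_trend(test_demo, lookup_demo)
print(filled_demo$TI)
```

With the real data this would be called as fill_by_trend(test, lookup) and could be applied to a list of test data frames with lapply. Gaps in the middle of the column are filled from the preceding value, and only NAs before the first known value are filled backwards.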

0 answers:

No answers