Filling NAs in time series data using different interpolation techniques

Asked: 2017-12-14 13:13:53

Tags: r interpolation na

df <- data.frame(
  Time = c("7/16/2017 18:46", "7/16/2017 21:52",
    "7/16/2017 23:16", "7/17/2017 4:03", "7/17/2017 5:13", "7/17/2017 5:27",
    "7/17/2017 18:57", "7/17/2017 19:25", "7/17/2017 23:58", "7/18/2017 2:59",
    "7/18/2017 3:27", "7/18/2017 3:59"),
  Flux = c(NA, NA, 4.51263406,
    NA, NA, 2.291454049, NA, 4.568703192, NA, NA, 3.392520428, NA),
  int = c(403.5413091, 421.5796345, NA, 410.0796897, NA, NA,
    363.5271212, NA, NA, 398.9564539, NA, NA),
  corr = c(422.745436,
    447.6726631, NA, 420.4392183, NA, NA, 408.7056493, NA, NA, 421.8799971,
    NA, NA),
  dat = c(NA, NA, NA, NA, 2.316481462, NA, NA, NA, 7.11779784,
    NA, NA, 2.953349661),
  stringsAsFactors = FALSE
)

# Parse the character timestamps into POSIXct
df$Time <- as.POSIXct(df$Time, format = "%m/%d/%Y %H:%M")

The data looks like this:

Time             Flux         int          corr         dat
7/16/2017 18:46  NA           403.5413091  422.745436   NA
7/16/2017 21:52  NA           421.5796345  447.6726631  NA
7/16/2017 23:16  4.51263406   NA           NA           NA
7/17/2017 4:03   NA           410.0796897  420.4392183  NA
7/17/2017 5:13   NA           NA           NA           2.316481462
7/17/2017 5:27   2.291454049  NA           NA           NA
7/17/2017 18:57  NA           363.5271212  408.7056493  NA
7/17/2017 19:25  4.568703192  NA           NA           NA
7/17/2017 23:58  NA           NA           NA           7.11779784
7/18/2017 2:59   NA           398.9564539  421.8799971  NA
7/18/2017 3:27   3.392520428  NA           NA           NA
7/18/2017 3:59   NA           NA           NA           2.953349661

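I have four columns (1 time column, 3 continuous data columns). Each column contains many NA values. I want to interpolate and fill the NAs in all columns. Since I don't know which interpolation method I need, I would like to try several of them (linear, spline, etc.). I tried na.approx, but it did not work.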

Any help would be appreciated.

2 Answers:

Answer 0: (score: 1)

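If you want to try and compare several interpolation methods, as you describe, you can use the na.interpolation() function from the imputeTS package. (In imputeTS 3.0 and later the same function is named na_interpolation(); the dotted name is deprecated.)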

For linear interpolation:

library("imputeTS")
na.interpolation(df, option = "linear")

For spline interpolation:

library("imputeTS")
na.interpolation(df, option = "spline")

For Stineman interpolation:

library("imputeTS")
na.interpolation(df, option = "stine")

As you can see, you only need to change the option parameter.
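
One caveat worth checking: as far as I can tell, na.interpolation() interpolates over the row index, so it treats the observations as equally spaced, while the timestamps in the question are irregular. If the gaps between readings matter, a minimal base R sketch that interpolates against the actual time could look like the following. The helper name fill_by_time is hypothetical (it is not part of imputeTS), and the UTC time zone is an assumption made only for reproducibility.

```r
# Sketch: time-aware interpolation of one numeric column, base R only.
# `fill_by_time` is a hypothetical helper, not an imputeTS function.
fill_by_time <- function(time, y, method = c("linear", "spline")) {
  method <- match.arg(method)
  x  <- as.numeric(time)        # POSIXct -> seconds since the epoch
  ok <- !is.na(y)               # positions with observed values
  if (sum(ok) < 2) return(y)    # too few points to interpolate
  if (method == "linear") {
    # rule = 2 repeats the nearest observed value for leading/trailing NAs
    approx(x[ok], y[ok], xout = x, rule = 2)$y
  } else {
    # natural cubic spline through the observed points
    spline(x[ok], y[ok], xout = x, method = "natural")$y
  }
}

time <- as.POSIXct(
  c("7/16/2017 18:46", "7/16/2017 21:52", "7/16/2017 23:16", "7/17/2017 4:03",
    "7/17/2017 5:13", "7/17/2017 5:27", "7/17/2017 18:57", "7/17/2017 19:25",
    "7/17/2017 23:58", "7/18/2017 2:59", "7/18/2017 3:27", "7/18/2017 3:59"),
  format = "%m/%d/%Y %H:%M", tz = "UTC"  # time zone is an assumption
)
flux <- c(NA, NA, 4.51263406, NA, NA, 2.291454049,
          NA, 4.568703192, NA, NA, 3.392520428, NA)

flux_linear <- fill_by_time(time, flux, "linear")
flux_spline <- fill_by_time(time, flux, "spline")
```

To cover every numeric column, the same helper could be applied with something like df[-1] <- lapply(df[-1], function(col) fill_by_time(df$Time, col, "linear")), assuming Time is the first column and has already been converted to POSIXct. With so few observed points per column, the spline variant can overshoot, so it is worth plotting both results before choosing.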

Answer 1: (score: 0)

library(tidyr)
df <- fill(df, everything())

But I don't know which technique it uses to fill the NAs.
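
On the question of technique: per the tidyr documentation, fill() does not interpolate. It copies the previous non-missing value (or the next one, depending on .direction) into each gap. A small sketch on made-up toy data, assuming tidyr 1.0.0 or later is installed, makes the behavior visible:

```r
library(tidyr)

# Toy column (illustrative values, not from the question)
toy <- data.frame(x = c(NA, 1, NA, NA, 4, NA))

# "down": carry the last observed value forward; a leading NA stays NA
down <- fill(toy, x, .direction = "down")

# "downup": fill downward first, then fill any remaining leading NAs upward
both <- fill(toy, x, .direction = "downup")

print(down$x)
print(both$x)
```

Every filled value is an exact copy of a neighbor, so a gap between 1 and 4 becomes 1, 1 rather than 2, 3. That makes fill() a carry-forward method, not a linear or spline interpolation.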