Interpolating data conditional on gap length

Time: 2016-12-14 15:24:45

Tags: r time-series interpolation na

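I want to interpolate a time series using a spline method, with a gap tolerance: if there are more than x consecutive days of NA, those values should be left as NA rather than interpolated. In my example, whenever a run of consecutive NA days exceeds the tolerance, I will not interpolate it. Sample data: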

x <- seq(as.Date("2016-01-01"),as.Date("2016-01-31"),by="day")
y <- c(0.45062130 ,0.51136174 ,NA ,NA ,0.29481738 ,NA ,0.27713756 ,0.62638512 ,0.23547530,0.29253901 ,0.75899501 ,0.67779756 ,0.51831742 ,0.08050147 ,0.71183739 ,NA ,0.79406706 ,NA,0.03434758 ,0.59573892 ,0.22102821 ,0.13154414 ,NA ,NA ,NA ,NA ,0.23692593,0.95215104 ,0.38810846 ,0.17970580 ,0.05176054)

df <- data.frame(x,y)

> df
            x          y
 2016-01-01 0.45062130
 2016-01-02 0.51136174
 2016-01-03         NA
 2016-01-04         NA
 2016-01-05 0.29481738
 2016-01-06         NA
 2016-01-07 0.27713756
 2016-01-08 0.62638512
 2016-01-09 0.23547530
 2016-01-10 0.29253901
 2016-01-11 0.75899501
 2016-01-12 0.67779756
 2016-01-13 0.51831742
 2016-01-14 0.08050147
 2016-01-15 0.71183739
 2016-01-16         NA
 2016-01-17 0.79406706
 2016-01-18         NA
 2016-01-19 0.03434758
 2016-01-20 0.59573892
 2016-01-21 0.22102821
 2016-01-22 0.13154414
 2016-01-23         NA
 2016-01-24         NA
 2016-01-25         NA
 2016-01-26         NA
 2016-01-27 0.23692593
 2016-01-28 0.95215104
 2016-01-29 0.38810846
 2016-01-30 0.17970580
 2016-01-31 0.05176054

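My idea is to create two new data frames: the first fully interpolated, and the second with only the NAs within the gap tolerance removed, and then merge them. Is there a better way?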
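For reference, the two-step idea described above (interpolate everything, then put NA back wherever a gap is too long) could be sketched in base R roughly as follows. This is only an illustration under assumptions: `max_gap`, `y_full`, `too_long` and `df_filled` are names made up here, the tolerance of 2 is chosen purely for demonstration, and `spline()` from the stats package with `method = "fmm"` stands in for whichever spline method is actually wanted.

```r
# Sample data from the question
x <- seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day")
y <- c(0.45062130, 0.51136174, NA, NA, 0.29481738, NA, 0.27713756,
       0.62638512, 0.23547530, 0.29253901, 0.75899501, 0.67779756,
       0.51831742, 0.08050147, 0.71183739, NA, 0.79406706, NA,
       0.03434758, 0.59573892, 0.22102821, 0.13154414, NA, NA, NA, NA,
       0.23692593, 0.95215104, 0.38810846, 0.17970580, 0.05176054)

max_gap <- 2  # assumed tolerance: longest NA run that is still filled

# Step 1: spline through the observed points, evaluated on every day
obs <- !is.na(y)
y_full <- spline(as.numeric(x[obs]), y[obs],
                 xout = as.numeric(x), method = "fmm")$y

# Step 2: flag every position that belongs to an NA run longer than max_gap
runs <- rle(is.na(y))
too_long <- rep(runs$values & runs$lengths > max_gap, runs$lengths)

# Step 3: keep observed values, fill short gaps, restore NA in long gaps
y_filled <- ifelse(is.na(y), y_full, y)
y_filled[too_long] <- NA

df_filled <- data.frame(x, y = y_filled)
```

With this data and `max_gap <- 2`, only the four-day run from 2016-01-23 to 2016-01-26 stays NA; the shorter gaps are filled from the spline.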

The dataset I want looks like this:

> df
            x          y
 2016-01-01 0.45062130
 2016-01-02 0.51136174
 2016-01-03 0.35684617
 2016-01-04 0.30481738
 2016-01-05 0.29481738
 2016-01-06 0.28481738
 2016-01-07 0.27713756
 2016-01-08 0.62638512
 2016-01-09 0.23547530
 2016-01-10 0.29253901
 2016-01-11 0.75899501
 2016-01-12 0.67779756
 2016-01-13 0.51831742
 2016-01-14 0.08050147
 2016-01-15 0.71183739
 2016-01-16 0.75158886
 2016-01-17 0.79406706
 2016-01-18 0.21584455
 2016-01-19 0.03434758
 2016-01-20 0.59573892
 2016-01-21 0.22102821
 2016-01-22 0.13154414
 2016-01-23         NA
 2016-01-24         NA
 2016-01-25         NA
 2016-01-26         NA
 2016-01-27 0.23692593
 2016-01-28 0.95215104
 2016-01-29 0.38810846
 2016-01-30 0.17970580
 2016-01-31 0.05176054

1 answer:

Answer 0 (score: 1)

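Try na.spline in the zoo package. Its maxgap argument sets the maximum number of consecutive NAs to fill; longer gaps are left as NA. (fortify.zoo(z) will convert z back to a data frame, although you may prefer to keep it as a zoo object to take advantage of zoo's other facilities.) Also have a look at the other na.* functions in zoo.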

library(zoo)
z <- na.spline(zoo(y, x), maxgap = 2)

which gives:

> z
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05 2016-01-06 2016-01-07 
0.45062130 0.51136174 0.50365727 0.43252778 0.29481738 0.14613360 0.27713756 
2016-01-08 2016-01-09 2016-01-10 2016-01-11 2016-01-12 2016-01-13 2016-01-14 
0.62638512 0.23547530 0.29253901 0.75899501 0.67779756 0.51831742 0.08050147 
2016-01-15 2016-01-16 2016-01-17 2016-01-18 2016-01-19 2016-01-20 2016-01-21 
0.71183739 1.06652092 0.79406706 0.17526465 0.03434758 0.59573892 0.22102821 
2016-01-22 2016-01-23 2016-01-24 2016-01-25 2016-01-26 2016-01-27 2016-01-28 
0.13154414         NA         NA         NA         NA 0.23692593 0.95215104 
2016-01-29 2016-01-30 2016-01-31 
0.38810846 0.17970580 0.05176054
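As a follow-up sketch, the conversion back to a data frame mentioned above, plus one of the other na.* functions, might look like this. The names `df_out` and `z_lin` are made up for illustration, and renaming the columns to `x` and `y` is only an assumption about the layout wanted. na.approx is shown because linear interpolation stays between the neighbouring observations, whereas the spline can overshoot them (note the 1.0665 on 2016-01-16 in the output above).

```r
library(zoo)

# Sample data from the question
x <- seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day")
y <- c(0.45062130, 0.51136174, NA, NA, 0.29481738, NA, 0.27713756,
       0.62638512, 0.23547530, 0.29253901, 0.75899501, 0.67779756,
       0.51831742, 0.08050147, 0.71183739, NA, 0.79406706, NA,
       0.03434758, 0.59573892, 0.22102821, 0.13154414, NA, NA, NA, NA,
       0.23692593, 0.95215104, 0.38810846, 0.17970580, 0.05176054)

# Spline interpolation, leaving runs of more than 2 NAs untouched
z <- na.spline(zoo(y, x), maxgap = 2)

# Back to a data frame; the column names are reset to match the question
df_out <- fortify.zoo(z)
names(df_out) <- c("x", "y")

# Linear alternative with the same gap rule
z_lin <- na.approx(zoo(y, x), maxgap = 2)
```

Both results leave the four-day gap from 2016-01-23 to 2016-01-26 as NA.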