R zoo objects: different fill methods for different columns

Asked: 2014-06-26 20:55:42

Tags: r merge time-series zoo

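I have two zoo objects in R, a1 and a2, that I want to merge and then fill in the missing values. I want to fill the missing values that come from a1 with na.approx() and the missing values that come from a2 with na.locf(). How can I do that?

An example: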

a1 <- zoo(data.frame(b=c(1,3,4,6),c=c(2,6,8,12)),c(1,3,4,6))
  b  c
1 1  2
3 3  6
4 4  8
6 6 12
a2 <- zoo(data.frame(d=c(1,0)),c(2,5))
  d
2 1
5 0
a3 <- merge(a1,a2)
   b  c  d
1  1  2 NA
2 NA NA  1
3  3  6 NA
4  4  8 NA
5 NA NA  0
6  6 12 NA

Now I want to get to a4:

   b  c  d
1  1  2 NA
2  2  4  1
3  3  6  1
4  4  8  1
5  5 10  0
6  6 12  0

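Both a4 <- na.locf(a3) and a4 <- na.approx(a3) treat all columns the same way. How can I handle the columns separately? Or can I solve this before merging?

Thanks in advance

**Edit: real data now included**

To illustrate in more detail, here is some real data: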

> dput(a5)
structure(c(133.7, NA, 133.345, NA, 134.2, NA, 135.721, 136.456, 
136.677, 137.347, 138.324, NA, 139.086, 139.622, 140.475, NA, 
141.179, 141.652, 141.811, 125.901, NA, 125.965, NA, 127.402, 
NA, 128.529, 128.797, 129.267, 130.08, 130.831, NA, 131.313, 
132.008, 132.85, NA, 133.416, 133.842, 133.986, NA, 0, NA, 1, 
NA, 0, NA, NA, NA, NA, NA, 1, NA, NA, NA, 0, NA, NA, NA), .Dim = c(19L, 
3L), .Dimnames = list(NULL, c("sup", "ret", "gas")), index = structure(c(1387242143, 
1387242156, 1387242158, 1387242169, 1387242173, 1387242186, 1387242188, 
1387242203, 1387242218, 1387242233, 1387242248, 1387242252, 1387242263, 
1387242278, 1387242293, 1387242305, 1387242308, 1387242323, 1387242338
), class = c("POSIXct", "POSIXt")), class = "zoo")
> a5
                        sup     ret gas
2013-12-16 19:02:23 133.700 125.901  NA
2013-12-16 19:02:36      NA      NA   0
2013-12-16 19:02:38 133.345 125.965  NA
2013-12-16 19:02:49      NA      NA   1
2013-12-16 19:02:53 134.200 127.402  NA
2013-12-16 19:03:06      NA      NA   0
2013-12-16 19:03:08 135.721 128.529  NA
2013-12-16 19:03:23 136.456 128.797  NA
2013-12-16 19:03:38 136.677 129.267  NA
2013-12-16 19:03:53 137.347 130.080  NA
2013-12-16 19:04:08 138.324 130.831  NA
2013-12-16 19:04:12      NA      NA   1
2013-12-16 19:04:23 139.086 131.313  NA
2013-12-16 19:04:38 139.622 132.008  NA
2013-12-16 19:04:53 140.475 132.850  NA
2013-12-16 19:05:05      NA      NA   0
2013-12-16 19:05:08 141.179 133.416  NA
2013-12-16 19:05:23 141.652 133.842  NA
2013-12-16 19:05:38 141.811 133.986  NA

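Stu's solution does not preserve the accuracy of the data in the first two columns. I want the NAs in the first two columns to be interpolated with na.approx(), and the NAs in the last column to be filled by carrying the last observation forward with na.locf().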

> na.locf(ceiling(na.approx(a5)))
                    sup ret gas
2013-12-16 19:02:23 134 126  NA
2013-12-16 19:02:36 134 126   0
2013-12-16 19:02:38 134 126   1
2013-12-16 19:02:49 134 128   1
2013-12-16 19:02:53 135 128   1
2013-12-16 19:03:06 136 129   0
2013-12-16 19:03:08 136 129   1
2013-12-16 19:03:23 137 129   1
2013-12-16 19:03:38 137 130   1
2013-12-16 19:03:53 138 131   1
2013-12-16 19:04:08 139 131   1
2013-12-16 19:04:12 139 131   1
2013-12-16 19:04:23 140 132   1
2013-12-16 19:04:38 140 133   1
2013-12-16 19:04:53 141 133   1
2013-12-16 19:05:05 142 134   0
2013-12-16 19:05:08 142 134   0
2013-12-16 19:05:23 142 134   0
2013-12-16 19:05:38 142 134   0

**The above is not what I need**

Thanks again

**Edit - a solution that gives the result I am looking for**

> a6 <- cbind(na.approx(a5[,c("sup","ret")]),na.locf(a5[,c("gas")]))
> a6
                         sup      ret na.locf(a5[, c("gas")])
2013-12-16 19:02:23 133.7000 125.9010                      NA
2013-12-16 19:02:36 133.3923 125.9565                       0
2013-12-16 19:02:38 133.3450 125.9650                       0
2013-12-16 19:02:49 133.9720 127.0188                       1
2013-12-16 19:02:53 134.2000 127.4020                       1
2013-12-16 19:03:06 135.5182 128.3787                       0
2013-12-16 19:03:08 135.7210 128.5290                       0
2013-12-16 19:03:23 136.4560 128.7970                       0
2013-12-16 19:03:38 136.6770 129.2670                       0
2013-12-16 19:03:53 137.3470 130.0800                       0
2013-12-16 19:04:08 138.3240 130.8310                       0
2013-12-16 19:04:12 138.5272 130.9595                       1
2013-12-16 19:04:23 139.0860 131.3130                       1
2013-12-16 19:04:38 139.6220 132.0080                       1
2013-12-16 19:04:53 140.4750 132.8500                       1
2013-12-16 19:05:05 141.0382 133.3028                       0
2013-12-16 19:05:08 141.1790 133.4160                       0
2013-12-16 19:05:23 141.6520 133.8420                       0
2013-12-16 19:05:38 141.8110 133.9860                       0

The column names of the new object still need to be fixed, but that is easy.
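One way to fix the names, as a minimal sketch: assign them with colnames() after binding. The object `z` below is a small hypothetical stand-in for a5 (same column names, made-up values), and `filled` is just an illustrative name.

```r
library(zoo)

# Toy stand-in for a5: hypothetical values, same column names as the real data
z <- zoo(cbind(sup = c(1, NA, 3), ret = c(10, NA, 30), gas = c(0, NA, 1)), 1:3)

# Interpolate sup and ret, carry the last gas observation forward
filled <- cbind(na.approx(z[, c("sup", "ret")]), na.locf(z[, "gas"]))

# Replace the auto-generated name of the third column
colnames(filled) <- c("sup", "ret", "gas")
filled
```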

1 answer:

Answer 0: (score: 1)

Use the column-bind function, cbind(), and pass each group of columns to the NA-filling function you want for it:

  

a4 <- cbind(na.approx(a3[,1:2]), na.locf(a3[,3]))

It is only one line of code, and you can rearrange the columns as needed.
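For reference, here is a self-contained sketch of that one-liner run against the small example from the question. Two details are additions rather than part of the original answer: the `d =` argument name, used so that the third column keeps its name, and `na.rm = FALSE`, used so that the leading NA in d is kept explicitly instead of being dropped and then re-added by the merge.

```r
library(zoo)

# Rebuild the example objects from the question
a1 <- zoo(data.frame(b = c(1, 3, 4, 6), c = c(2, 6, 8, 12)), c(1, 3, 4, 6))
a2 <- zoo(data.frame(d = c(1, 0)), c(2, 5))
a3 <- merge(a1, a2)

# Linear interpolation for b and c; last observation carried forward for d
a4 <- cbind(na.approx(a3[, c("b", "c")]),
            d = na.locf(a3[, "d"], na.rm = FALSE))
a4
```

Selecting columns by name instead of by position is a small design choice: it keeps working if the column order of the merged object changes.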