Replace missing values with the previous value

Time: 2013-02-01 21:25:18

Tags: r zoo

Event,Time,Bid,Offer
Quote,0.458338,9.77,9.78
Order,0.458338,NA,NA
Order,0.458338,NA,NA
Order,0.458338,NA,NA
Quote,0.458363,9.78,9.79
Order,0.458364,NA,NA

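I have a data frame like the one above. I would like to write efficient (preferably vectorized) code that fills each NA with the Bid and Offer from the previous Quote row. The data is sorted by Time, and only Quote rows contain Bid and Offer values.

So that it becomes: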

Event,Time,Bid,Offer
Quote,0.458338,9.77,9.78
Order,0.458338,9.77,9.78
Order,0.458338,9.77,9.78
Order,0.458338,9.77,9.78
Quote,0.458363,9.78,9.79
Order,0.458364,9.78,9.79

Thanks

2 answers:

Answer 0: (score: 20)

The na.locf() function in the zoo package is your friend. locf stands for "last observation carried forward". Using your data:

dat <- read.table(text = "Event,Time,Bid,Offer
Quote,0.458338,9.77,9.78
Order,0.458338,NA,NA
Order,0.458338,NA,NA
Order,0.458338,NA,NA
Quote,0.458363,9.78,9.79
Order,0.458364,NA,NA
", header = TRUE, sep = ",")

library(zoo)

dat2 <- transform(dat, Bid = na.locf(Bid), Offer = na.locf(Offer))

which produces:

> dat2
  Event     Time  Bid Offer
1 Quote 0.458338 9.77  9.78
2 Order 0.458338 9.77  9.78
3 Order 0.458338 9.77  9.78
4 Order 0.458338 9.77  9.78
5 Quote 0.458363 9.78  9.79
6 Order 0.458364 9.78  9.79
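
One case the sample data does not cover is a file that starts with an Order row, so there is no earlier quote to carry forward. By default na.locf() drops leading NAs (na.rm = TRUE), which would leave the filled column shorter than the data frame. The sketch below assumes you would rather keep those rows with NA; the data frame dat_lead and its first timestamp are made up for illustration.

```r
library(zoo)

# Hypothetical variant of the data in which the first row is an Order,
# so no earlier quote exists to carry forward.
dat_lead <- data.frame(
  Event = c("Order", "Quote", "Order"),
  Time  = c(0.458330, 0.458338, 0.458338),
  Bid   = c(NA, 9.77, NA),
  Offer = c(NA, 9.78, NA)
)

# na.rm = FALSE keeps the leading NA, so each filled column
# still has one value per row of the data frame.
dat_lead2 <- transform(dat_lead,
                       Bid   = na.locf(Bid,   na.rm = FALSE),
                       Offer = na.locf(Offer, na.rm = FALSE))
```

Here the first row keeps NA for Bid and Offer, and the last row picks up the values from the Quote before it.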

Answer 1: (score: 2)

Try this:

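# Last Observation Move Forward
# Note: na.trim() comes from the zoo package and is only called when na.rm = TRUE.
na.lomf <- function(object, na.rm = FALSE) {
    # Fill a single vector: repeat each non-NA value until the next non-NA value.
    na.lomf.0 <- function(object) {
        idx <- which(!is.na(object))
        # Keep a leading NA in place when the vector starts with one.
        if (is.na(object[1])) idx <- c(1, idx)
        rep.int(object[idx], diff(c(idx, length(object) + 1)))
    }
    dimLen <- length(dim(object))
    # Vectors are filled directly; for a matrix (dimLen == 2) apply() works column by column.
    object <- if (dimLen == 0) na.lomf.0(object) else apply(object, dimLen, na.lomf.0)
    if (na.rm) na.trim(object, sides = "left", is.na = "all") else object
}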
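
If you want to avoid any package dependency, a similar vectorized fill can be sketched in base R with cumsum(). This is a different indexing trick from the one used in na.lomf above, and it only handles plain vectors; the name locf_base is made up for this example.

```r
# Base-R sketch of "last observation carried forward" for a plain vector.
locf_base <- function(x) {
  seen <- !is.na(x)
  # cumsum(seen) counts the non-NA values seen so far; 0 means none yet.
  # Prepending NA and shifting the index by one maps that 0 to NA,
  # so leading NAs stay NA and the result keeps the input length.
  c(NA, x[seen])[cumsum(seen) + 1]
}

bid <- c(9.77, NA, NA, NA, 9.78, NA)
locf_base(bid)
# [1] 9.77 9.77 9.77 9.77 9.78 9.78
```

It can be applied to each column in turn, for example transform(dat, Bid = locf_base(Bid), Offer = locf_base(Offer)).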