Question

Suppose I need to apply an MA(5) to a batch of market data stored in an xts object. I can easily pull the subset of data I want to smooth using xts subsetting:

x['2013-12-05 17:00:01/2013-12-06 17:00:00']

However, I need an additional 5 observations before the first one in my subset to "prime" the filter. Is there an easy way to do this?

The only thing I have been able to figure out is really ugly, with explicit row numbers (here using the xts sample data):

require(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

x$rn <- row(x[,1])
frst <- first(x['2007-05-18'])$rn
finl <- last(x['2007-06-09'])$rn
ans <- x[(frst-5):finl,]

Can I just say: yuck? Someone please help me.

UPDATE: By popular request, a short example that applies an MA(5) to the daily data in sample_matrix:

require(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)$Close

calc_weights <- function(x) {
    ##replace rnorm with sophisticated analysis
    wgts <- matrix(rnorm(5,0,0.5), nrow=1)
    xts(wgts, index(last(x)))
}

smooth_days <- function(x, wgts) {
    w <- wgts[index(last(x))]
    out <- filter(x, w, sides=1)
    xts(out, index(x))
}

set.seed(1.23456789)
wgts <- apply.weekly(x, calc_weights)
lapply(split(x, f='weeks'), smooth_days, wgts)

For brevity, only the output for the last week:

[[26]]
                [,1]
2007-06-25        NA
2007-06-26        NA
2007-06-27        NA
2007-06-28        NA
2007-06-29 -9.581503
2007-06-30 -9.581208

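The NAs are my problem. I want to recalculate my weights for each week of data and apply those new weights to the upcoming week. Rinse, repeat. In real life, I replace the lapply with some ugly code based on row indexes, but I'm sure there's a better way.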

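To define the problem clearly: the analysis runs on non-overlapping time periods (weeks, in this case), but the calculation needs overlapping periods of data (2 weeks, in this case).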

Answer 1

Here's one way to do it using endpoints and a for loop. You could still use the which.i=TRUE suggestion from my comment, but integer subsetting is faster.

y <- x*NA                   # pre-allocate result
ep <- endpoints(x,"weeks")  # time points where parameters change

set.seed(1.23456789)
for(i in seq_along(ep)[-(1:2)]) {
  rng1 <- ep[i-1]:ep[i]          # obs to calc weights
  rng2 <- ep[i-2]:ep[i]          # "prime" obs
  wgts <- calc_weights(x[rng1])
  # calc smooth_days on rng2, but only keep rng1 results
  y[rng1] <- smooth_days(x[rng2], wgts)[index(x[rng1])]
}
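For the one-off subset in the question, the which.i=TRUE suggestion mentioned above could look roughly like the sketch below. It assumes the xts sample data and the date range from the question; `n_prime` and `start` are names made up for this illustration, not part of xts.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# which.i = TRUE returns the integer row positions matched by the
# date range instead of the data itself
idx <- x['2007-05-18/2007-06-09', which.i = TRUE]

n_prime <- 5                        # extra observations to "prime" the filter (hypothetical name)
start <- max(1, idx[1] - n_prime)   # do not run past the first row of x

# the requested range plus the n_prime rows that precede it
ans <- x[start:idx[length(idx)], ]
```

This avoids adding a helper column of row numbers to x. After filtering, the priming rows can be dropped again with ans[-seq_len(n_prime), ], provided the range did not start within the first n_prime rows of x.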

Prepend xts rows to a subset
