Question

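Is there a good package in R that allows subsetting (i.e. indexing) a time series by times that are not in the series? For example, in financial applications, indexing a price series by a timestamp that is not in the database should return the latest available price before that timestamp.

In code, this is what I would like: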

n =15
full.dates = seq(Sys.Date(), by = 'day', length = n)
series.dates = full.dates[c(1:10, 12, 15)] 
require(zoo)
series=zoo(rep(1,length(series.dates)), series.dates)
series[full.dates[11]]

returns

Data:
numeric(0)

Index:
character(0)

However, I would like this to return the value at the last existing date before full.dates[11], i.e. full.dates[10]:

series[full.dates[10]]
2014-01-03 
     1

Thanks

Answer 1

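You can use index to extract the index of the observations in your zoo object. The index can then be used to subset the object. Step by step, to show the logic (if I have understood you correctly, you only need the last step):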

# the index of the observations, here dates
index(series)

# are the dates smaller than your reference date?
index(series) < full.dates[11]

# subset observations: dates less than reference date
series[index(series) < full.dates[11]]

# select last observation before reference date:
tail(series[index(series) < full.dates[11]], 1)

# 2014-01-03 
#          1
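The last step above can be wrapped in a small helper. This is only a sketch: the name last_obs is hypothetical, the dates are fixed (rather than Sys.Date()) so the result is reproducible, the values are made distinct so the match is visible, and <= is used instead of < so that a date that does exist returns its own value.

```r
library(zoo)

# Fixed dates and distinct values: assumptions made for reproducibility only.
full.dates <- seq(as.Date("2014-01-01"), by = "day", length.out = 15)
series <- zoo(seq_len(12), full.dates[c(1:10, 12, 15)])

# Hypothetical helper: last observation at or before a single date `when`.
# Returns an empty zoo object if `when` precedes every observation.
last_obs <- function(z, when) tail(z[index(z) <= when], 1)

last_obs(series, full.dates[11])  # date 11 is missing, so date 10 is returned
last_obs(series, full.dates[12])  # date 12 exists, so it is returned as is
```

Switch back to < if you strictly want the observation before the reference date.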

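A possible alternative may be to expand your time series and "replac[e] each NA with the most recent non-NA" using na.locf and the xout argument (see also ?approx and ?na.locf and this answer):

# expand time series to the range of dates in 'full.dates'
series2 <- na.locf(series, xout = full.dates)
series2

# select observation at reference date
series2[full.dates[10]]
# 2014-01-03 
#          1

If you would rather have the missing values in your incomplete series replaced by "next observation carried backward", you need to merge your series with a 'dummy' zoo object that contains the desired range of consecutive dates.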
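A minimal sketch of the merge approach might look like the following. It assumes that a zero-width zoo object built with zoo(, full.dates) serves as the 'dummy' series and that na.locf(..., fromLast = TRUE) is the intended way to carry the next observation backward. The fixed dates and distinct values are assumptions for reproducibility.

```r
library(zoo)

# Fixed dates and distinct values: assumptions made for reproducibility only.
full.dates <- seq(as.Date("2014-01-01"), by = "day", length.out = 15)
series <- zoo(seq_len(12), full.dates[c(1:10, 12, 15)])

# Merge with a zero-width zoo object holding every date;
# dates absent from 'series' appear as NA.
expanded <- merge(series, zoo(, full.dates))

# Fill each NA from the next available observation.
filled <- na.locf(expanded, fromLast = TRUE)

# The missing date 11 now carries the value observed on date 12.
filled[full.dates[11]]
```

Dropping fromLast = TRUE gives the usual "last observation carried forward" fill on the same expanded series.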

Answer 2

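na.locf(x, xout = newdate) does not seem much worse than subscripting, but in any case here we define a subclass of "zoo" called "zoo2" in which [ uses na.locf. This is an untested minimal implementation, but it could be extended: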

as.zoo2 <- function(x) UseMethod("as.zoo2")
as.zoo2.zoo <- function(x) structure(x, class = c("zoo2", setdiff(class(x), "zoo2")))
"[.zoo2" <- function(x, i, ...) {
    if (!missing(i) && inherits(i, class(index(x)))) {
        zoo:::`[.zoo`(na.locf(x, xout = i),, ...)
    } else as.zoo2(zoo:::`[.zoo`(x, i, ...))
}

This gives:

> series2 <- as.zoo2(series)
> series2[full.dates[11]]
2014-01-04 
         1

Answer 3

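I feel strongly that subsetting functions should not return the prior row when the requested index value does not exist. Subsetting functions should return what the user asked for; they should not assume the user wanted something different from what they requested.

If that is what you want, though, you can easily handle it with an if statement.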

series.subset <- series[full.dates[11]]
if (NROW(series.subset) == 0) {
  # merge series with an empty zoo object
  # that contains the index value you want
  prior <- merge(series, zoo(, full.dates[11]))
  # lag *back* one period so the NA is on the prior value
  prior <- lag(prior, 1)
  # get the index value at the prior value
  prior <- index(prior)[is.na(prior)]
  # subset again
  series.subset <- series[prior]
}
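If the fallback is needed in more than one place, the same "exact match first, prior observation otherwise" logic could be kept explicit in a function. The sketch below is not the merge/lag approach above: it swaps in base R's findInterval to locate the prior observation. The name subset_or_prior is hypothetical, it handles a single date only, and the fixed dates and distinct values are assumptions for reproducibility.

```r
library(zoo)

# Fixed dates and distinct values: assumptions made for reproducibility only.
full.dates <- seq(as.Date("2014-01-01"), by = "day", length.out = 15)
series <- zoo(seq_len(12), full.dates[c(1:10, 12, 15)])

# Hypothetical helper for a single date `when`:
# return the exact match if there is one, otherwise the prior observation.
subset_or_prior <- function(z, when) {
  hit <- z[when]
  if (NROW(hit) > 0) return(hit)
  # position of the last index value that is <= when (0 if there is none)
  pos <- findInterval(as.numeric(when), as.numeric(index(z)))
  if (pos == 0L) return(hit)  # nothing earlier: keep the empty result
  z[pos]
}

subset_or_prior(series, full.dates[11])  # no exact match, falls back to date 10
subset_or_prior(series, full.dates[12])  # exact match on date 12
```

Calling the function by a distinct name keeps plain series[...] subsetting strict, which is the point argued above.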

Zoo series subsetting with an index not in the series

3 answers: