Question

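I have a list of xts objects that cover mutually exclusive days. I would like to merge the list into one large xts object. My attempt at doing this was: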

merged_reg_1_min_prices <- do.call(cbind, reg_1_min_prices)

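However, this seems to run out of memory. reg_1_min_prices is 6,000 days of 1-minute returns on mutually exclusive days, so it is not very large. Does anyone know how to get around this?

To be clear: reg_1_min_prices contains mutually exclusive days with 1-minute prices for each day, and each entry in the list is an xts object.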

Answer 1

I use the strategy provided by Dominik in his answer to this question.

I have turned it into a function in the qmao package. This code is also at the core of getSymbols.FI in the FinancialInstrument package.

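# Pairwise (tree-style) rbind: combine neighbours until one object remains.
do.call.rbind <- function(lst) {
  while(length(lst) > 1) {
    idxlst <- seq(from=1, to=length(lst), by=2)  # first index of each pair
    lst <- lapply(idxlst, function(i) {
      if(i==length(lst)) { return(lst[[i]]) }    # odd element out: carry it over
      return(rbind(lst[[i]], lst[[i+1]]))
    })
  }
  lst[[1]]
}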

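If you want to rbind data.frames, @JoshuaUlrich has provided an elegant solution here.

As far as I can tell (without looking very closely), memory is not an issue with any of the three solutions offered (@JoshuaUlrich's, @Alex's, and qmao::do.call.rbind). So it comes down to speed...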

library(xts)
l <- lapply(Sys.Date()-6000:1, function(x) {
    N=60*8;xts(rnorm(N),as.POSIXct(x)-seq(N*60,1,-60))})
GS <- do.call.rbind
JU <- function(x) Reduce(rbind, x)
Alex <- function(x) do.call(rbind, lapply(x, as.data.frame)) #returns data.frame, not xts

identical(GS(l), JU(l)) #TRUE

library(rbenchmark)
benchmark(GS(l), JU(l), Alex(l), replications=1)
     test replications elapsed relative user.self sys.self user.child sys.child
3 Alex(l)            1  89.575 109.9080    56.584   33.044          0         0
1   GS(l)            1   0.815   1.0000     0.599    0.216          0         0
2   JU(l)            1 209.783 257.4025   143.353   66.555          0         0

do.call.rbind is the clear winner.
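One plausible reason for the gap, offered as a rough model rather than a profile of rbind.xts: a left fold such as Reduce(rbind, x) re-copies the ever-growing result at every step, while the pairwise approach touches each row only about once per pass. The sketch below just counts rows copied under that assumption; fold_cost and tree_cost are hypothetical helper names, not part of xts or qmao.

```r
# Rows copied by a left fold: every step copies the whole accumulated result.
fold_cost <- function(sizes) {
  acc <- sizes[1]
  copied <- 0
  for (s in sizes[-1]) {
    acc <- acc + s
    copied <- copied + acc
  }
  copied
}

# Rows copied by pairwise reduction (same pairing rule as do.call.rbind).
tree_cost <- function(sizes) {
  copied <- 0
  while (length(sizes) > 1) {
    i <- seq(1, length(sizes), by = 2)
    paired <- i[i < length(sizes)]   # an odd trailing block is not copied
    copied <- copied + sum(sizes[paired] + sizes[paired + 1])
    sizes <- vapply(i, function(k) {
      if (k == length(sizes)) sizes[k] else sizes[k] + sizes[k + 1]
    }, numeric(1))
  }
  copied
}

sizes <- rep(480, 6000)              # 6000 days of 480 one-minute rows
fold_cost(sizes) / tree_cost(sizes)  # ratio of rows copied under this model
```

This ignores allocation overhead and whatever rbind.xts does internally, so treat the ratio as an order-of-magnitude intuition only.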

Answer 2

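You don't want to use merge, because that would return a 6000-column object with one row for every row in every list element (2,880,000 in my example). Most of the values would be NA. cbind.xts simply calls merge.xts with a few default argument values, so you don't want to use that either.

We're aware of the memory problem caused by calling rbind.xts via do.call. Jeff does have more efficient code, but it is a prototype that is not public.

An alternative to @GSee's solution is to use Reduce. This takes a while to run on my laptop, but memory is not an issue, even with only 4GB.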
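To make the shape difference concrete, here is a minimal sketch on a deliberately tiny, made-up input (3 days of 4 one-minute bars; the column name "price" and the start time are assumptions for illustration). It compares the dimensions that merge and rbind produce.

```r
library(xts)

# Hypothetical toy data: 3 non-overlapping days, 4 one-minute bars each.
start <- as.POSIXct("2020-01-01 09:30:00", tz = "UTC")
l <- lapply(0:2, function(k) {
  xts(matrix(rnorm(4), ncol = 1, dimnames = list(NULL, "price")),
      order.by = start + 86400 * k + 60 * 0:3)
})

wide <- do.call(merge, l)  # one column per list element, padded with NA
long <- do.call(rbind, l)  # a single column, rows stacked in time order

dim(wide)          # 12 rows, 3 columns
dim(long)          # 12 rows, 1 column
sum(is.na(wide))   # 24 of the 36 cells are NA
```

With 6000 list elements the wide result grows to 6000 columns, which is where the memory goes.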

library(xts)
l <- lapply(Sys.Date()-6000:1, function(x) {
  N=60*8;xts(rnorm(N),as.POSIXct(x)-seq(N*60,1,-60))})
x <- Reduce(rbind, l)

Answer 3

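Here is how to do this efficiently: convert each xts object to a data.frame and simply rbind them. This barely increases memory usage. If necessary, simply create a new xts object from the data.frame.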
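A minimal sketch of that idea, on a small made-up list (the start time and sizes are assumptions for illustration). One choice here is mine rather than the answer's: the time index is collected separately with index() instead of being parsed back from the data.frame row names, which avoids depending on how timestamps are formatted as strings.

```r
library(xts)

# Hypothetical toy data: 3 non-overlapping days, 4 one-minute bars each.
start <- as.POSIXct("2020-01-01 09:30:00", tz = "UTC")
l <- lapply(0:2, function(k) xts(rnorm(4), start + 86400 * k + 60 * 0:3))

# Row-bind the values as data.frames.
df <- do.call(rbind, lapply(l, as.data.frame))

# Collect the timestamps in the same order as the rows of df.
idx <- do.call(c, lapply(l, index))

# Rebuild a single xts object from the combined data and index.
merged <- xts(df, order.by = idx)
```

As the benchmark in Answer 1 suggests, the data.frame step was much slower than the pairwise rbind there, so this is mainly useful when a data.frame is wanted anyway.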

Merging a large number of xts objects
