R xts和data.table

时间:2012-02-15 14:30:27

标签: r xts data.table

我可以像使用data.frame一样将data.table转换为xts对象:

> df = data.frame(x = c("a", "b", "c", "d"), v = rnorm(4))
> dt = data.table(x = c("a", "b", "c", "d"), v = rnorm(4))
> xts(df, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v           
2011-01-01 15:30:00 "a" "-1.2232283"
2011-01-02 15:30:00 "b" "-0.1654551"
2011-01-03 15:50:50 "c" "-0.4456202"
2011-01-04 15:30:00 "d" "-0.9416562"
> xts(dt, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v           
2011-01-01 15:30:00 "a" " 1.3089579"
2011-01-02 15:30:00 "b" "-1.7681071"
2011-01-03 15:50:50 "c" "-1.4375100"
2011-01-04 15:30:00 "d" "-0.2467274"

使用带有xts的data.table是否有任何问题?

2 个答案:

答案 0 :(得分:18)

只是为了解决一个悬而未决的问题。

正如文森特在评论中指出的那样,没有任何问题。

它包含在data.table 1.9.5中。以下是类似的内容:

as.data.table.xts <- function(x, keep.rownames = TRUE){
  stopifnot(requireNamespace("xts") || !missing(x) || xts::is.xts(x))
  r = setDT(as.data.frame(x), keep.rownames = keep.rownames)
  if(!keep.rownames) return(r[])
  setnames(r,"rn","index")
  setkeyv(r,"index")[]
}

as.xts.data.table <- function(x){
  stopifnot(requireNamespace("xts") || !missing(x) || is.data.table(x) || any(class(x[[1]] %in% c("POSIXct","Date"))))
  colsNumeric = sapply(x, is.numeric)[-1] # exclude first col, xts index
  if(any(!colsNumeric)){
    warning(paste("Following columns are not numeric and will be omitted:",paste(names(colsNumeric)[!colsNumeric],collapse=", ")))
  }
  r = setDF(x[,.SD,.SDcols=names(colsNumeric)[colsNumeric]])
  rownames(r) <- x[[1]]
  xts::as.xts(r)
}

答案 1 :(得分:8)

由于 quantmod ,通常会有一个xts,并在所有列名中嵌入符号。 (例如“SPY.Open”,“SPY.High”等)。所以,这里是Jan的as.data.table.xts的替代方案,它将符号放在一个单独的列中,这在data.table中更自然(因为在进行任何分析之前,您可能会考虑使用这些符号) )。

as.data.table.xts <- function(x, ...) {
  cn <- colnames(x)
  sscn <- strsplit(cn, "\\.")  
  indexClass(x) <- c('POSIXct', 'POSIXt') #coerce index to POSIXct
  DT <- data.table(time=index(x), coredata(x))
  #DT <- data.table(IDateTime(index(x)), coredata(x))

  ## If there is a Symbol embedded in the colnames, strip it out and make it a 
  ## column
  if (all(sapply(sscn, "[", 1) == sscn[[1]][1])) {
    Symbol <- sscn[[1]][1]
    setnames(DT, names(DT)[-1], sub(paste0(Symbol, "."), "", cn))
    DT <- DT[, Symbol:=Symbol]
    setkey(DT, Symbol, time)[]
  } else {
    setkey(DT, time)[]
  }
}

library(quantmod)
getSymbols("SPY")
as.data.table(SPY)
            time   Open   High    Low  Close   Volume Adjusted Symbol
   1: 2007-01-03 142.25 142.86 140.57 141.37 94807600   120.36    SPY
   2: 2007-01-04 141.23 142.05 140.61 141.67 69620600   120.61    SPY
   3: 2007-01-05 141.33 141.40 140.38 140.54 76645300   119.65    SPY
   4: 2007-01-08 140.82 141.41 140.25 141.19 71655000   120.20    SPY
   5: 2007-01-09 141.31 141.60 140.40 141.07 75680100   120.10    SPY
  ---                                                                
1993: 2014-12-01 206.30 206.60 205.38 205.64 12670100   205.64    SPY
1994: 2014-12-02 205.81 207.34 205.78 207.09 72105500   207.09    SPY
1995: 2014-12-03 207.30 208.15 207.10 207.89 69450000   207.89    SPY
1996: 2014-12-04 207.54 208.27 206.70 207.66 89928200   207.66    SPY
1997: 2014-12-05 207.87 208.47 207.55 208.00 85031000   208.00    SPY