Reading a CSV in R with zoo

Time: 2016-07-03 16:56:36

Tags: r csv xts zoo

I have a CSV in the following format:

TICKER,PER,DATE,TIME,CLOSE
SYMBOL,1,20160104,1002,14180.0000000
SYMBOL,1,20160104,1003,14241.0000000

I want to read it in as a time series:

f <- function(a, b) {
    c <- paste(a, b)
    return(strptime(c, format = "%Y%m%d %H%M"))
}
d <- read.zoo("test.csv", FUN = f, index.column = list("DATE", "TIME"))

What I get is "index does not match data". Why?

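2 Answers:

Answer 0: (score: 3)

Character and numeric columns cannot both be part of the time series data, because the data portion of a zoo object is a matrix, and a matrix must be all numeric, all character, or all of some other single type. However, split= can be used to split on the character column and produce a wide format. We can also avoid having to write the function f by specifying format= and tz= instead. Finally, we must state that a header is present (header=) and that the fields are separated by the "," character (sep=).

(Below we use text = Lines to keep the example reproducible; in practice, replace it with "test.csv".)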

Lines <- "TICKER,PER,DATE,TIME,CLOSE
SYMBOL,1,20160104,1002,14180.0000000
SYMBOL,1,20160104,1003,14241.0000000"

library(zoo)

read.zoo(text = Lines, header = TRUE, sep = ",", index = c("DATE", "TIME"), 
  split = "TICKER", format = "%Y%m%d %H%M", tz = "")

giving:

                    PER CLOSE
2016-01-04 10:02:00   1 14180
2016-01-04 10:03:00   1 14241

Note: If you really do want to use your function f, omit format and tz and use:

read.zoo(text = Lines, header = TRUE, sep = ",", index = c("DATE", "TIME"), 
  split = "TICKER", FUN = f)

This also works, i.e. read the file into a data frame and then read the data frame into a zoo object:

DF <- read.csv(text = Lines) # read.csv defaults to header=TRUE, sep=","
read.zoo(DF, index = c("DATE", "TIME"), split = "TICKER", FUN = f)
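
The point made at the start of this answer, that a matrix holds only one type, can be checked directly. The following is a minimal sketch rather than part of the solution; the names m, idx and z are made up for illustration, and the UTC time zone is an arbitrary choice.

```r
library(zoo)

# Binding a character column to a numeric one coerces the whole matrix to character.
m <- cbind(TICKER = c("SYMBOL", "SYMBOL"), CLOSE = c(14180, 14241))
mode(m)            # "character"

# The same holds once the matrix becomes the data part of a zoo object.
idx <- as.POSIXct(c("2016-01-04 10:02:00", "2016-01-04 10:03:00"), tz = "UTC")
z <- zoo(m, order.by = idx)
mode(coredata(z))  # "character"
```

This is why splitting on TICKER, which leaves only numeric columns in the data part, keeps CLOSE numeric.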

Answer 1: (score: 1)

You need to specify header = TRUE and sep = ",", because they are not the defaults for read.zoo as they are for read.csv.

d <- read.zoo(text="TICKER,PER,DATE,TIME,CLOSE
SYMBOL,1,20160104,1002,14180.0000000
SYMBOL,1,20160104,1003,14241.0000000",
  FUN = f, index.column = list("DATE", "TIME"),
  header=TRUE, sep=",")
d
#                     TICKER PER CLOSE
# 2016-01-04 10:02:00 SYMBOL 1   14180
# 2016-01-04 10:03:00 SYMBOL 1   14241
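
The difference in defaults mentioned above can be inspected from R itself. This is a small sketch that assumes read.zoo hands its extra arguments to read.table, which is its documented default reader; the variable names are illustrative only.

```r
# Defaults of read.csv: a header line is expected and fields are comma separated.
csv_header <- formals(utils::read.csv)$header    # TRUE
csv_sep    <- formals(utils::read.csv)$sep       # ","

# Defaults of read.table: no header, fields separated by white space.
tab_header <- formals(utils::read.table)$header  # FALSE
tab_sep    <- formals(utils::read.table)$sep     # ""

print(list(csv_header = csv_header, csv_sep = csv_sep,
           tab_header = tab_header, tab_sep = tab_sep))
```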