I have an xts series of stock trade events that I want to process into a 1-minute OHLC time series. For example, this set of trades:
Timestamp Price Size
9:30:00.123 12.32 200
9:30:00.532 12.21 100
9:30:32.352 12.22 500
9:30:45.342 12.35 200
should produce this record for 9:30:00:
Timestamp Open High Low Close
9:30:00 12.32 12.35 12.21 12.35
My approach is to split the raw trade series by minute:
myminseries = do.call(rbind, lapply(split(mytrades, "minutes"), myminprocessing))
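This produces the records I want, but there is one problem: if the stock has no trades in a given minute, that minute's record is missing entirely. What I want is an all-zero record for every minute without trades. For example, if there are no trades in the 9:31:00 minute, I should get: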
Timestamp Open High Low Close
9:30:00 12.32 12.35 12.21 12.35
9:31:00 0 0 0 0
9:32:00 12.40 12.42 12.38 12.42
How can I backfill the 1-minute series? Or should I use an approach entirely different from split()?
Answer 0 (score: 6)
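to.minutes also "will miss that minute record altogether" if there are no trades in a given minute. You can work around this by merging the result with a zero-width, strictly regular xts series, which leaves NA rows for the empty minutes.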
## Make sample data
> x <- xts(cumsum(rnorm(600, 0, 0.2)), Sys.time() - 600:1) # 10 minutes of secondly data
> # remove all data from a couple different minutes
> x['2012-03-19 17:33'] <- NA
> x['2012-03-19 17:35'] <- NA
> x <- na.omit(x)
>
> ## Convert to minutes
> xm <- to.minutes(x)
> head(xm)
x.Open x.High x.Low x.Close
2012-03-19 17:31:59 0.1945049 1.661000 -0.35943057 1.6610000
2012-03-19 17:32:59 1.7283877 1.728388 -0.69288918 1.1398868
2012-03-19 17:34:59 2.0529582 2.603881 -0.80532315 -0.8053232
2012-03-19 17:36:59 0.5314270 1.189609 -0.94996548 0.5807342
2012-03-19 17:37:59 0.3761700 1.943363 0.04046976 0.9101720
2012-03-19 17:38:59 1.0614807 1.722110 -0.22147145 1.4075637
> axm <- align.time(xm) # align times to the beginning of the next period
>
> # to make strictly regular, create an xts object that has values for each minute
> tmp <- xts(, seq.POSIXt(start(axm), end(axm), by='min'))
> out <- cbind(tmp, axm)
> out
x.Open x.High x.Low x.Close
2012-03-19 17:32:00 0.19450494 1.66100005 -0.35943057 1.66100005
2012-03-19 17:33:00 1.72838773 1.72838773 -0.69288918 1.13988679
2012-03-19 17:34:00 NA NA NA NA
2012-03-19 17:35:00 2.05295818 2.60388093 -0.80532315 -0.80532315
2012-03-19 17:36:00 NA NA NA NA
2012-03-19 17:37:00 0.53142696 1.18960858 -0.94996548 0.58073422
2012-03-19 17:38:00 0.37616997 1.94336348 0.04046976 0.91017202
2012-03-19 17:39:00 1.06148070 1.72211018 -0.22147145 1.40756366
2012-03-19 17:40:00 1.28437005 1.28437005 -0.62691689 -0.62691689
2012-03-19 17:41:00 -0.56820166 0.90339983 -0.77554869 0.26101945
2012-03-19 17:42:00 -0.07443971 -0.07443971 -0.07443971 -0.07443971
> na.locf(out)
x.Open x.High x.Low x.Close
2012-03-19 17:32:00 0.19450494 1.66100005 -0.35943057 1.66100005
2012-03-19 17:33:00 1.72838773 1.72838773 -0.69288918 1.13988679
2012-03-19 17:34:00 1.72838773 1.72838773 -0.69288918 1.13988679
2012-03-19 17:35:00 2.05295818 2.60388093 -0.80532315 -0.80532315
2012-03-19 17:36:00 2.05295818 2.60388093 -0.80532315 -0.80532315
2012-03-19 17:37:00 0.53142696 1.18960858 -0.94996548 0.58073422
2012-03-19 17:38:00 0.37616997 1.94336348 0.04046976 0.91017202
2012-03-19 17:39:00 1.06148070 1.72211018 -0.22147145 1.40756366
2012-03-19 17:40:00 1.28437005 1.28437005 -0.62691689 -0.62691689
2012-03-19 17:41:00 -0.56820166 0.90339983 -0.77554869 0.26101945
2012-03-19 17:42:00 -0.07443971 -0.07443971 -0.07443971 -0.07443971
Or, if you really want zeros where there are no values, you can do out[is.na(out)] <- 0
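To tie this back to the question, here is a minimal sketch of the same steps applied to trades shaped like the ones in the question. The date, the object name `price`, and the three trades in the 9:32 minute are made up for illustration; only the four 9:30 prices come from the question. Note that, as in the transcript above, align.time() labels each bar with the end of its minute, so the empty 9:31 minute appears as the 09:32:00 row.

```r
library(xts)

# Hypothetical trades: the date and the 9:32 trades are assumptions;
# the four 9:30 prices are taken from the question.
ts <- as.POSIXct(c("2012-03-19 09:30:00.123", "2012-03-19 09:30:00.532",
                   "2012-03-19 09:30:32.352", "2012-03-19 09:30:45.342",
                   "2012-03-19 09:32:05.000", "2012-03-19 09:32:20.000",
                   "2012-03-19 09:32:40.000"), tz = "UTC")
price <- xts(c(12.32, 12.21, 12.22, 12.35, 12.40, 12.38, 12.42), order.by = ts)

# One OHLC bar per minute that has trades, stamped at the end of the minute
bars <- align.time(to.minutes(price), n = 60)

# Zero-width, strictly regular one-minute grid covering the same span
grid <- xts(, seq(start(bars), end(bars), by = "min"))

# Merging adds NA rows for minutes without trades; replace them with zeros
out <- merge(grid, bars)
out[is.na(out)] <- 0
out
```

If the bars should be labeled with the start of the minute, as in the question, one option would be to shift the index back by 60 seconds after aligning.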
Answer 1 (score: 3)
There are to.period() functions in xts, such as to.minutes(), that can do this.
Dirk
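As a small illustration of the to.period() family mentioned in this answer, the sketch below aggregates five minutes of simulated one-second data into 1-minute and 5-minute bars. The data, the seed, and the object names are assumptions made for the example. Like to.minutes(), to.period() only returns bars for periods that contain observations, so it does not by itself fill empty minutes.

```r
library(xts)

# Simulated one-second series (hypothetical data): 300 points from 17:30:00 to 17:34:59 UTC
set.seed(1)
start_time <- as.POSIXct("2012-03-19 17:30:00", tz = "UTC")
x <- xts(cumsum(rnorm(300, 0, 0.2)), order.by = start_time + 0:299)

# 1-minute OHLC bars via the convenience wrapper
m1 <- to.minutes(x)

# 5-minute OHLC bars via the general function
m5 <- to.period(x, period = "minutes", k = 5)

nrow(m1)  # number of 1-minute bars
nrow(m5)  # number of 5-minute bars
```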