I'm learning R (including the xts and quantmod packages). Here is my data set:
str(h2)
‘zoo’ series from 2016-06-15 11:00:00 to 2016-09-15 14:00:00
Data: num [1:928, 1:5] 67842 67486 67603 67465 67457 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] "X.OPEN." "X.HIGH." "X.LOW." "X.CLOSE." ...
Index: POSIXct[1:928], format: "2016-06-15 11:00:00" "2016-06-15 12:00:00" "2016-06-15 13:00:00" ...
first(h2, '1 day')
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
2016-06-15 11:00:00 67842 67842 67122 67488 262740
2016-06-15 12:00:00 67486 67610 67420 67603 288875
2016-06-15 13:00:00 67603 67608 67381 67466 323498
2016-06-15 14:00:00 67465 67484 67356 67455 168991
2016-06-15 15:00:00 67457 67460 67289 67361 174965
2016-06-15 16:00:00 67363 67381 67202 67317 195579
2016-06-15 17:00:00 67320 67465 67288 67397 230255
2016-06-15 18:00:00 67397 67436 67084 67099 469379
2016-06-15 19:00:00 67096 67198 66900 67058 264430
2016-06-15 20:00:00 67040 67094 66944 67092 110503
2016-06-15 21:00:00 67092 67158 66877 66992 83041
2016-06-15 22:00:00 66993 67110 66680 66909 386905
2016-06-15 23:00:00 66909 67269 66884 67126 143373
dput(first(h2, '1 day'))
structure(c(67842, 67486, 67603, 67465, 67457, 67363, 67320,
67397, 67096, 67040, 67092, 66993, 66909, 67842, 67610, 67608,
67484, 67460, 67381, 67465, 67436, 67198, 67094, 67158, 67110,
67269, 67122, 67420, 67381, 67356, 67289, 67202, 67288, 67084,
66900, 66944, 66877, 66680, 66884, 67488, 67603, 67466, 67455,
67361, 67317, 67397, 67099, 67058, 67092, 66992, 66909, 67126,
262740, 288875, 323498, 168991, 174965, 195579, 230255, 469379,
264430, 110503, 83041, 386905, 143373), .Dim = c(13L, 5L), .Dimnames = list(
NULL, c("X.OPEN.", "X.HIGH.", "X.LOW.", "X.CLOSE.", "X.VOL."
)), index = structure(c(1465977600, 1465981200, 1465984800,
1465988400, 1465992000, 1465995600, 1465999200, 1466002800, 1466006400,
1466010000, 1466013600, 1466017200, 1466020800), class = c("POSIXct",
"POSIXt"), tzone = ""), class = "zoo")
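I can't figure out how to solve a task like the following: for every day in the sample, compare the sign of the difference (X.CLOSE - X.OPEN) at 11:00 with the sign of the difference (X.CLOSE(13:00) - X.OPEN(12:00)).
While working on this I ran into two issues:
1) How do I access the data at a specific time? For example, how do I get X.OPEN at 12:00 on a day I choose? I tried different combinations (see the code below) but got no result, only the column headers: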
h2["T11:00:00/12:00:00"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
h2["2016-06-15 11:00:00 MSK"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
x <- h2["2016-06-15 11:00:00 MSK"] #'zoo' series (without observations)
x
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
h2["T12:00:00.000/T12:00:00.001"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
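2) I previously built a similar algorithm in Excel and solved it by brute force, checking the data set step by step. But R has vector data types, so there should be a faster and more convenient way to solve this task.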
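Answer 0 (score: 0)
To get the data for a specific time from a time series, use window. Note that the index in the dput output has tzone = "", so times print in the local time zone of whoever runs the code; the output below was produced in a different zone from the question's, which is why the question's 11:00 bar appears here as 04:00.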
window(h2, start = as.POSIXct('2016-06-15 04:00'), end = as.POSIXct('2016-06-15 04:59'))
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 04:00:00 67842 67842 67122 67488 262740
window(h2, start = as.POSIXct('2016-06-15 06:00'), end = as.POSIXct('2016-06-15 08:00'))
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 06:00:00 67603 67608 67381 67466 323498
# 2016-06-15 07:00:00 67465 67484 67356 67455 168991
# 2016-06-15 08:00:00 67457 67460 67289 67361 174965
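The window calls above select a range on a single date. To pull the same clock time from every day in the sample, one possible sketch is to filter on the formatted index. The toy series z and its values are made up for illustration, and UTC is assumed so the result does not depend on the local time zone. If the data were converted with as.xts, xts's time-of-day subsetting (for example x["T12:00/T12:59"]) should also work; the attempts in the question most likely return nothing because h2 is a plain zoo object, not an xts one.

```r
library(zoo)

# Hypothetical toy data: two days of hourly bars (11:00, 12:00, 13:00), in UTC.
idx <- as.POSIXct("2016-06-15 11:00:00", tz = "UTC") + 3600 * c(0:2, 24:26)
z <- zoo(cbind(X.OPEN.  = c(10, 11, 12, 20, 21, 22),
               X.CLOSE. = c(11, 12, 13, 19, 23, 24)),
         order.by = idx)

# Clock time of each row as "HH:MM", independent of the date.
hhmm <- format(index(z), "%H:%M", tz = "UTC")

# Keep only the 12:00 bar of every day.
at_noon <- z[hhmm == "12:00", ]

# X.OPEN. at 12:00 for each day, as a plain numeric vector.
open_at_noon <- as.numeric(coredata(at_noon)[, "X.OPEN."])
```

With real data, the tz argument would need to match the zone in which the bars are meant to be read (the question's output suggests Moscow time).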
You can also access rows by number.
h2[1,]
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 04:00:00 67842 67842 67122 67488 262740
h2[4:6,]
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 07:00:00 67465 67484 67356 67455 168991
# 2016-06-15 08:00:00 67457 67460 67289 67361 174965
# 2016-06-15 09:00:00 67363 67381 67202 67317 195579
To take the difference between X.CLOSE. and X.OPEN. at every time point, you can do:
h2_diff <- h2[,'X.CLOSE.'] - h2[,'X.OPEN.']
# 2016-06-15 04:00:00 2016-06-15 05:00:00 2016-06-15 06:00:00 2016-06-15 07:00:00
# -354 117 -137 -10
# 2016-06-15 08:00:00 2016-06-15 09:00:00 2016-06-15 10:00:00 2016-06-15 11:00:00
# -96 -46 77 -298
# 2016-06-15 12:00:00 2016-06-15 13:00:00 2016-06-15 14:00:00 2016-06-15 15:00:00
# -38 52 -100 -84
# 2016-06-15 16:00:00
# 217
To compute a lagged difference, you can use the lag function.
h2[1:3,'X.CLOSE.']
# 2016-06-15 04:00:00 2016-06-15 05:00:00 2016-06-15 06:00:00
# 67488 67603 67466
lag(h2[1:3,'X.CLOSE.'])
# 2016-06-15 04:00:00 2016-06-15 05:00:00
# 67603 67466
lag(h2[1:3,'X.CLOSE.'], -1)
# 2016-06-15 05:00:00 2016-06-15 06:00:00
# 67488 67603
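The direction of the shift is easy to get wrong, so here is a small sketch on made-up values. It assumes a plain zoo series, as in the question: for zoo, a negative k gives the previous observation and a positive k the next one. As far as I know, xts's lag method uses the opposite sign convention, and packages such as dplyr mask lag altogether, so the result depends on the class of the object and on what is loaded. The na.pad = TRUE argument keeps the original index and fills the missing end with NA.

```r
library(zoo)

# Hypothetical three-point daily series.
z <- zoo(c(10, 20, 30), order.by = as.Date("2016-06-15") + 0:2)

# zoo convention: k = -1 aligns each time with the PREVIOUS value.
prev_val <- lag(z, -1, na.pad = TRUE)

# zoo convention: k = 1 aligns each time with the NEXT value.
next_val <- lag(z, 1, na.pad = TRUE)

# Side-by-side view: current, previous, and next value per date.
aligned <- merge(current = z, previous = prev_val, following = next_val)
```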
So to compute X.CLOSE(n) - X.OPEN(n - 1):
h2[,'X.CLOSE.'] - lag(h2[,'X.OPEN.'], -1)
# 2016-06-15 05:00:00 2016-06-15 06:00:00 2016-06-15 07:00:00 2016-06-15 08:00:00
# -239 -20 -148 -104
# 2016-06-15 09:00:00 2016-06-15 10:00:00 2016-06-15 11:00:00 2016-06-15 12:00:00
# -140 34 -221 -339
# 2016-06-15 13:00:00 2016-06-15 14:00:00 2016-06-15 15:00:00 2016-06-15 16:00:00
# -4 -48 -183 133
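Putting the pieces together for the task in the question, a minimal sketch of the per-day sign comparison might look like the following. The toy series z, the helper at_time, and the UTC time zone are all assumptions made for illustration; they are not part of the original data or of any package. The idea is to extract each needed bar as a vector named by date, then compare signs only on days where all bars exist.

```r
library(zoo)

# Hypothetical toy data: two days of hourly bars (11:00, 12:00, 13:00), in UTC.
zone <- "UTC"
idx <- as.POSIXct("2016-06-15 11:00:00", tz = zone) + 3600 * c(0:2, 24:26)
z <- zoo(cbind(X.OPEN.  = c(10, 11, 12, 20, 21, 22),
               X.CLOSE. = c(11, 12, 13, 19, 23, 24)),
         order.by = idx)

day  <- as.Date(index(z), tz = zone)
hhmm <- format(index(z), "%H:%M", tz = zone)

# Hypothetical helper: one column at one clock time, named by date.
at_time <- function(col, time) {
  keep <- hhmm == time
  setNames(as.numeric(coredata(z)[keep, col]), as.character(day[keep]))
}

open11  <- at_time("X.OPEN.",  "11:00")
close11 <- at_time("X.CLOSE.", "11:00")
open12  <- at_time("X.OPEN.",  "12:00")
close13 <- at_time("X.CLOSE.", "13:00")

# Only compare days on which the 11:00, 12:00 and 13:00 bars are all present.
days <- Reduce(intersect, list(names(open11), names(open12), names(close13)))

result <- data.frame(
  day       = days,
  sign_11   = unname(sign(close11[days] - open11[days])),
  sign_span = unname(sign(close13[days] - open12[days]))
)
result$same_sign <- result$sign_11 == result$sign_span
```

Everything here is vectorised over days, so no explicit loop over rows is needed. Matching by date rather than by position means that days with a missing bar are dropped instead of silently shifting the comparison.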