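I'm using this code to draw candlesticks. However, it contains a very inefficient loop (looping over 10K observations takes 38 seconds). It also uses the rbind
function, which means the dates have to be converted to numeric and then back again, and that does not seem straightforward given how dates and times are handled.
The loop I'm trying to replace with a more efficient function: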
for(i in 1:nrow(prices)){
  x <- prices[i, ]
  # For high / low
  mat <- rbind(c(x[1], x[3]),
               c(x[1], x[4]),
               c(NA, NA))
  plot.base <- rbind(plot.base, mat)
}
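For each input row, the output has three observations: the first holds the first (date) and third columns of the input data, the second holds the first and fourth columns, and the third is two NAs. These NAs are important for the plotting.
What is the most efficient way to achieve this?
Minimal reproducible example: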
library(quantmod)
prices <- getSymbols("MSFT", auto.assign = FALSE)
# Convert to dataframe
prices <- data.frame(time = index(prices),
                     open = as.numeric(prices[,1]),
                     high = as.numeric(prices[,2]),
                     low = as.numeric(prices[,3]),
                     close = as.numeric(prices[,4]),
                     volume = as.numeric(prices[,5]))
# Create line segments for high and low prices
plot.base <- data.frame()
for(i in 1:nrow(prices)){
  x <- prices[i, ]
  # For high / low
  mat <- rbind(c(x[1], x[3]),
               c(x[1], x[4]),
               c(NA, NA))
  plot.base <- rbind(plot.base, mat)
}
Edit:
dput(head(prices))
structure(list(time = structure(c(13516, 13517, 13518, 13521,
13522, 13523), class = "Date"), open = c(29.91, 29.700001, 29.629999,
29.65, 30, 29.799999), high = c(30.25, 29.969999, 29.75, 30.1,
30.18, 29.889999), low = c(29.4, 29.440001, 29.450001, 29.530001,
29.73, 29.43), close = c(29.860001, 29.809999, 29.639999, 29.93,
29.959999, 29.66), volume = c(76935100, 45774500, 44607200, 50220200,
44636600, 55017400)), .Names = c("time", "open", "high", "low",
"close", "volume"), row.names = c(NA, 6L), class = "data.frame")
Answer 0 (score: 4)
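I would be wary of any tutorial that grows an object in a loop. That is one of the slowest operations you can do in programming. (It is like buying a shelf with exactly the room your current books need, and then replacing the shelf every time you buy a new book.)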
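To make the shelf analogy concrete, here is a minimal sketch that builds the same toy table three ways. The functions `grow`, `prealloc`, and `vectorized` are hypothetical names made up for this illustration, and the squares table is a stand-in, not the candlestick data.

```r
# Illustrative only: three ways to build the same table of squares.

# 1. Growing: every rbind() copies everything accumulated so far.
grow <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(i = i, sq = i^2))
  }
  out
}

# 2. Preallocating: allocate the result once, then fill it in place.
prealloc <- function(n) {
  sq <- numeric(n)
  for (i in seq_len(n)) {
    sq[i] <- i^2
  }
  data.frame(i = seq_len(n), sq = sq)
}

# 3. Vectorized: no explicit loop at all.
vectorized <- function(n) {
  i <- seq_len(n)
  data.frame(i = i, sq = i^2)
}

n <- 2000
t_grow <- system.time(a <- grow(n))
t_pre  <- system.time(b <- prealloc(n))
t_vec  <- system.time(d <- vectorized(n))
```

Comparing `t_grow`, `t_pre`, and `t_vec` on your own machine should show the growing version falling further behind as `n` increases, because the amount copied grows with every iteration.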
Use subsetting like this:
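# Transpose high/low to a 2 x n matrix, add an NA row by indexing with NA,
# then flatten column by column into one vector.
res <- data.frame(date = rep(prices[, 1], each = 3),
                  y = c(t(prices[, 3:4])[c(1:2, NA), ]))
# Blank the date on every third row (the logical index is recycled).
res[c(FALSE, FALSE, TRUE), 1] <- NA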
#         date     y
#1  2007-01-03 30.25
#2  2007-01-03 29.40
#3        <NA>    NA
#4  2007-01-04 29.97
#5  2007-01-04 29.44
#6        <NA>    NA
#7  2007-01-05 29.75
#8  2007-01-05 29.45
#9        <NA>    NA
#10 2007-01-08 30.10
#11 2007-01-08 29.53
#12       <NA>    NA
#13 2007-01-09 30.18
#14 2007-01-09 29.73
#15       <NA>    NA
#16 2007-01-10 29.89
#17 2007-01-10 29.43
#18       <NA>    NA
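If the transpose-and-index step feels opaque, the same interleaving can be sketched with `rbind()` on plain vectors. This is an alternative spelling, not the code above: it assumes a cut-down `prices` with only `time`, `high`, and `low`, filled with the first three rows of the `dput()` output from the question.

```r
# Cut-down sample: first three rows of the dput() output in the question.
prices <- data.frame(
  time = as.Date(c(13516, 13517, 13518), origin = "1970-01-01"),
  high = c(30.25, 29.969999, 29.75),
  low  = c(29.4, 29.440001, 29.450001)
)

n <- nrow(prices)

# rbind() on vectors gives a 3 x n matrix (rows: high, low, NA; the single
# NA is recycled). c() reads it column by column, which interleaves them.
y <- c(rbind(prices$high, prices$low, NA))

# Repeat each date three times, then blank every third entry.
# The vector keeps its Date class, so no numeric round trip is needed.
date <- rep(prices$time, each = 3)
date[seq(3, 3 * n, by = 3)] <- NA

res <- data.frame(date = date, y = y)
```

Because base graphics break a line wherever an `NA` occurs, something like `plot(res$date, res$y, type = "l")` should then draw one separate vertical high/low segment per day.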