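I am trying to find, for each day, the first time 'price' rises above another value called 'dayhigh'.
I ran into problems converting this result to a time series object, so I am just using the POSIXlt class for the datetime, with the reference date in the Date class. The sample data is in a data frame named "example" (the reproducible snippet after the table builds the same data as df):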
day,datetime,price,dayhigh
2016-09-01,2016-09-01 15:00:00,1.11912,1.11990
2016-09-01,2016-09-01 15:00:00,1.13000,1.11990
2016-09-01,2016-09-01 15:00:01,1.11911,1.11990
2016-09-05,2016-09-05 15:00:00,1.11436,1.11823
2016-09-05,2016-09-05 15:00:01,1.11436,1.11823
2016-09-05,2016-09-05 15:00:01,1.11900,1.11823
2016-09-05,2016-09-05 15:00:01,1.11436,1.11823
2016-09-06,2016-09-06 15:00:00,1.12383,1.12557
2016-09-06,2016-09-06 15:00:00,1.12382,1.12557
2016-09-06,2016-09-06 15:00:00,1.12382,1.12557
2016-09-06,2016-09-06 15:00:00,1.12384,1.12557
2016-09-06,2016-09-06 15:00:00,1.12384,1.12557
2016-09-06,2016-09-06 15:00:00,1.12558,1.12557
2016-09-06,2016-09-06 15:00:01,1.12559,1.12557
df <- data.frame(
  day = c("2016-09-01", "2016-09-01", "2016-09-01", "2016-09-05", "2016-09-05",
          "2016-09-05", "2016-09-05", "2016-09-06", "2016-09-06", "2016-09-06",
          "2016-09-06", "2016-09-06", "2016-09-06", "2016-09-06"),
  datetime = c("2016-09-01 15:00:00", "2016-09-01 15:00:00", "2016-09-01 15:00:01",
               "2016-09-05 15:00:00", "2016-09-05 15:00:01", "2016-09-05 15:00:01",
               "2016-09-05 15:00:01", "2016-09-06 15:00:00", "2016-09-06 15:00:00",
               "2016-09-06 15:00:00", "2016-09-06 15:00:00", "2016-09-06 15:00:00",
               "2016-09-06 15:00:00", "2016-09-06 15:00:01"),
  price = c(1.11912, 1.13, 1.11911, 1.11436, 1.11436, 1.119, 1.11436,
            1.12383, 1.12382, 1.12382, 1.12384, 1.12384, 1.12558, 1.12559),
  dayhigh = c(1.1199, 1.1199, 1.1199, 1.11823, 1.11823, 1.11823, 1.11823,
              1.12557, 1.12557, 1.12557, 1.12557, 1.12557, 1.12557, 1.12557)
)
One idea I had is to split the frame by day into a list of frames:
exlist <- split(example, as.Date(example$day))
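This returns a list of data frames, one per day.
What I would like to do is use which.max on each frame in the list and add a new column to each frame that is TRUE on the row where that day's first high occurs. The first high of the day is defined as the first row of the day with price > dayhigh.
From there I can combine everything back into a single frame and carry on with further analysis.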
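The split-and-recombine plan described above can be sketched in base R roughly as follows. This is only a minimal sketch under a few assumptions: the helper name mark_first_high and the column name first_high are illustrative, the data is a small subset of the sample rows, and the 2016-09-07 row is a made-up day on which price never exceeds dayhigh, added only to show that such a day ends up all FALSE.

```r
# Subset of the sample data, plus one hypothetical day (2016-09-07)
# on which price never exceeds dayhigh.
example <- data.frame(
  day     = c("2016-09-01", "2016-09-01", "2016-09-01",
              "2016-09-05", "2016-09-05", "2016-09-07"),
  price   = c(1.11912, 1.13000, 1.11911, 1.11436, 1.11900, 1.10000),
  dayhigh = c(1.11990, 1.11990, 1.11990, 1.11823, 1.11823, 1.12000)
)

# One data frame per day
exlist <- split(example, as.Date(example$day))

# Illustrative helper: flag the first row of a day where price > dayhigh
mark_first_high <- function(d) {
  hit <- which(d$price > d$dayhigh)
  # head(hit, 1) is empty when there is no hit, so %in% yields all FALSE
  d$first_high <- seq_len(nrow(d)) %in% head(hit, 1)
  d
}

# Apply per day, then bind the pieces back into one frame
result <- do.call(rbind, lapply(exlist, mark_first_high))
rownames(result) <- NULL
print(result)
```

Using which() plus head() instead of which.max() here is a deliberate choice: it avoids flagging the first row of a day that has no price above dayhigh.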
Answer 0 (score: 3)
There is no need to do all that work; you can do it in one step with data.table:
library(data.table)
setDT(df)
# Within each day, flag the row whose position equals the first index where price > dayhigh
df[ , first_high := (seq_len(.N) == which(price > dayhigh)[1]), by = day ]
df
# day datetime price dayhigh first_high
# 1: 2016-09-01 2016-09-01 15:00:00 1.11912 1.11990 FALSE
# 2: 2016-09-01 2016-09-01 15:00:00 1.13000 1.11990 TRUE
# 3: 2016-09-01 2016-09-01 15:00:01 1.11911 1.11990 FALSE
# 4: 2016-09-05 2016-09-05 15:00:00 1.11436 1.11823 FALSE
# 5: 2016-09-05 2016-09-05 15:00:01 1.11436 1.11823 FALSE
# 6: 2016-09-05 2016-09-05 15:00:01 1.11900 1.11823 TRUE
# 7: 2016-09-05 2016-09-05 15:00:01 1.11436 1.11823 FALSE
# 8: 2016-09-06 2016-09-06 15:00:00 1.12383 1.12557 FALSE
# 9: 2016-09-06 2016-09-06 15:00:00 1.12382 1.12557 FALSE
#10: 2016-09-06 2016-09-06 15:00:00 1.12382 1.12557 FALSE
#11: 2016-09-06 2016-09-06 15:00:00 1.12384 1.12557 FALSE
#12: 2016-09-06 2016-09-06 15:00:00 1.12384 1.12557 FALSE
#13: 2016-09-06 2016-09-06 15:00:00 1.12558 1.12557 TRUE
#14: 2016-09-06 2016-09-06 15:00:01 1.12559 1.12557 FALSE
Answer 1 (score: 3)
Here is another solution based on data.table.
library(data.table)
setDT(example)
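# .I holds the row numbers of the current group within the whole table;
# which.max() on a logical vector gives the position of its first TRUE
example[, first.high := (.I == .I[which.max(price > dayhigh)]), by = day]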
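One difference between which(...)[1] and which.max(...) is worth keeping in mind: they behave differently on a day where price never exceeds dayhigh. The base R sketch below uses a hypothetical all-FALSE condition vector (no_hit, not part of the sample data) to show this, along with one possible guard.

```r
# Hypothetical condition vector for a day with no price above dayhigh
no_hit <- c(FALSE, FALSE, FALSE)

# which() finds no TRUE, so taking element 1 gives NA
first_by_which <- which(no_hit)[1]

# which.max() returns the first position of the maximum, which is FALSE here,
# so it points at row 1 even though nothing exceeded dayhigh
first_by_which_max <- which.max(no_hit)

# Guarded variant: only trust which.max() when at least one TRUE exists
first_guarded <- if (any(no_hit)) which.max(no_hit) else NA_integer_

print(first_by_which)
print(first_by_which_max)
print(first_guarded)
```

In the sample data every day has at least one price above dayhigh, so both forms flag the same rows there.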
Answer 2 (score: 1)
You can use ave twice:
#1) make sure price is greater than dayhigh
#2) make sure it is the first such occurrence within the given subgroup.
ave(1:NROW(df), df$day, FUN = function(i) df$price[i] > df$dayhigh[i]) & #1
ave(1:NROW(df), df$day, FUN = function(i) cumsum(df$price[i] > df$dayhigh[i]) == 1) #2
#[1] FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
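The two conditions can also be folded into a single ave call by passing the logical comparison itself as the vector to be grouped. This is a sketch of that variant, not part of the original answer; it uses a small subset of the sample rows, and the names exceeds and first_high are illustrative.

```r
# Subset of the sample data; the last day has two consecutive prices above dayhigh
df <- data.frame(
  day     = c("2016-09-01", "2016-09-01", "2016-09-01",
              "2016-09-05", "2016-09-05", "2016-09-05",
              "2016-09-06", "2016-09-06"),
  price   = c(1.11912, 1.13000, 1.11911,
              1.11436, 1.11900, 1.11436,
              1.12558, 1.12559),
  dayhigh = c(1.11990, 1.11990, 1.11990,
              1.11823, 1.11823, 1.11823,
              1.12557, 1.12557)
)

# TRUE wherever price is above that day's dayhigh
exceeds <- df$price > df$dayhigh

# Per day: keep only the exceeding row at which the running count of hits is 1
first_high <- as.logical(
  ave(exceeds, df$day, FUN = function(x) x & cumsum(x) == 1)
)
print(first_high)
```

Only the first of the two exceeding rows on 2016-09-06 is flagged, because the running count is already 2 on the second one.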