Extracting the highest value within unequal periods of a time series

Time: 2016-10-31 07:24:28

Tags: r date dataframe overlap

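I have two data frames: period_example (with columns Beg and End) and price_example (with columns Date and High). For each Beg-End period, I want the maximum value of High. How can I do this? Thanks.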

Here is the data:

period_example <- data.frame(Beg = as.Date(c("2000-01-01","2000-01-04","2000-01-09")),
                             End = as.Date(c("2000-01-03","2000-01-08","2000-01-12")))
price_example <- data.frame(Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"), by="days"), 
                            High = c(100,105,104,103,102,106,107,108,109,110,115,114))

The result should look like this:

result <- data.frame(Beg = as.Date(c("2000-01-01","2000-01-04","2000-01-09")),
                     End = as.Date(c("2000-01-03","2000-01-08","2000-01-12")),
                     High = c(105,108,115))

3 answers:

Answer 0: (score: 2)

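I think I found a solution: apply a function to each row of period_example and take the maximum of High in the other data frame between those two dates: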

period_example <- data.frame(Beg = as.Date(c("2000-01-01","2000-01-04","2000-01-09")),End = as.Date(c("2000-01-03","2000-01-08","2000-01-12")))
price_example <- data.frame(Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"),by="days"), High = c(100,105,104,103,102,106,107,108,109,110,115,114))

# apply() passes each row as a character vector: x[1] is Beg and x[2] is End
period_example$High <- apply(period_example, 1, function(x) max(price_example[price_example$Date >= x[1] & price_example$Date <= x[2], "High"]))
> period_example
         Beg        End High
1 2000-01-01 2000-01-03  105
2 2000-01-04 2000-01-08  108
3 2000-01-09 2000-01-12  115
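
The apply() call above works because apply() first converts the data frame to a character matrix, and comparing a Date with a date string converts the string back to a Date. A minimal sketch that avoids this round trip by indexing the rows directly is shown below; the helper name max_high_in_period is made up for illustration and is not part of the original answer.

```r
period_example <- data.frame(
  Beg = as.Date(c("2000-01-01", "2000-01-04", "2000-01-09")),
  End = as.Date(c("2000-01-03", "2000-01-08", "2000-01-12"))
)
price_example <- data.frame(
  Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"), by = "days"),
  High = c(100, 105, 104, 103, 102, 106, 107, 108, 109, 110, 115, 114)
)

# Hypothetical helper: maximum High for the i-th period (both ends inclusive)
max_high_in_period <- function(i) {
  in_period <- price_example$Date >= period_example$Beg[i] &
    price_example$Date <= period_example$End[i]
  max(price_example$High[in_period])
}

# vapply() checks that every period yields exactly one numeric value
period_example$High <- vapply(seq_len(nrow(period_example)),
                              max_high_in_period, numeric(1))
print(period_example)
```

Note that if a period contains no price rows, max() is called on an empty vector and returns -Inf with a warning, so such periods may need separate handling.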

Answer 1: (score: 2)

data.table has a fast function for this: foverlaps

library(data.table)

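# setDT() converts both data frames to data.tables by reference (no copy)
x = setDT(period_example)
y = setDT(price_example)

# foverlaps() needs an interval in both tables, so treat each date as a one-day interval
y[, `:=` (Beg = Date, End = Date)]

# The second argument to foverlaps() must be keyed on its interval columns
setkey(x, Beg, End)
z = foverlaps(y, x)

# Maximum High within each period
z[, .(High = max(High)), by = .(Beg, End)]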
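
For readers who want to see the logic of an overlap join without data.table, here is a rough base R sketch of the same idea: pair every period with every price row, keep the pairs where the date falls inside the period, then aggregate. It is not how foverlaps is implemented, and it will be much slower on large data; the names all_pairs, inside and overlap_max are illustrative only.

```r
period_example <- data.frame(
  Beg = as.Date(c("2000-01-01", "2000-01-04", "2000-01-09")),
  End = as.Date(c("2000-01-03", "2000-01-08", "2000-01-12"))
)
price_example <- data.frame(
  Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"), by = "days"),
  High = c(100, 105, 104, 103, 102, 106, 107, 108, 109, 110, 115, 114)
)

# Cross join: by = NULL gives every period paired with every price row
all_pairs <- merge(period_example, price_example, by = NULL)

# Keep only the pairs whose Date lies inside the period (both ends inclusive)
inside <- all_pairs[all_pairs$Date >= all_pairs$Beg &
                      all_pairs$Date <= all_pairs$End, ]

# Maximum High per period, sorted by period start
overlap_max <- aggregate(High ~ Beg + End, data = inside, FUN = max)
overlap_max <- overlap_max[order(overlap_max$Beg), ]
print(overlap_max)
```

The cross join creates nrow(period_example) * nrow(price_example) rows, which is why a dedicated overlap join is preferable once the tables grow.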

Answer 2: (score: 0)

This should work:

period_example <- data.frame(Beg = as.Date(c("2000-01-01","2000-01-04","2000-01-09")),End = as.Date(c("2000-01-03","2000-01-08","2000-01-12")))

price_example <- data.frame(Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"),by="days"), High = c(100,105,104,103,102,106,107,108,109,110,115,114))


betweenDates <- function(target,beg,end){
  beg <- as.Date(beg)
  end <- as.Date(end)
  target <- as.Date(target)
  return(target>=beg&target<=end)
}

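# Logical matrix: one row per period, one column per price date
selectedDates <- sapply(price_example$Date, function(x) betweenDates(x, period_example$Beg, period_example$End))

# For each period (row), take the maximum High over the dates it contains
highValues <- sapply(1:nrow(period_example), function(x) max(price_example$High[selectedDates[x, ]]))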


result <- data.frame(period_example,High=highValues)
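
The period-by-date logical matrix used in this answer can also be built without a helper function. The sketch below uses outer() on the numeric day counts behind the dates, which is one possible vectorized variant and not the answerer's code; the names in_period and high_by_period are illustrative.

```r
period_example <- data.frame(
  Beg = as.Date(c("2000-01-01", "2000-01-04", "2000-01-09")),
  End = as.Date(c("2000-01-03", "2000-01-08", "2000-01-12"))
)
price_example <- data.frame(
  Date = seq(as.Date("2000-01-01"), as.Date("2000-01-12"), by = "days"),
  High = c(100, 105, 104, 103, 102, 106, 107, 108, 109, 110, 115, 114)
)

# Work on the underlying day counts so outer() compares plain numbers
beg <- as.numeric(period_example$Beg)
end <- as.numeric(period_example$End)
day <- as.numeric(price_example$Date)

# in_period[i, j] is TRUE when price date j falls inside period i
in_period <- outer(beg, day, "<=") & outer(end, day, ">=")

# Row-wise maximum of High over the dates selected for each period
high_by_period <- apply(in_period, 1, function(sel) max(price_example$High[sel]))

result <- data.frame(period_example, High = high_by_period)
print(result)
```

Like the sapply() version, this stores one logical value per period and date pair, so memory use grows with the product of the two table sizes.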