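I have an irregularly spaced time series in a data.frame. How can I get one row per event, namely the row holding that event's maximum value? (I want the whole row, not just the maximum value per event.)
Observations belong to different events if they are more than a certain time (e.g. three days) apart. Here is some fake data: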
set.seed(42)
x <- data.frame(date=as.Date("2017-08-01")+cumsum(ceiling(rexp(200, rate=0.2))),
                value=round(cumsum(rnorm(200, sd=8)))+500)
plot(x, type="o", pch=16, cex=0.6, las=1)
head(x, 20)
Answer 0 (score: 0)
# Time differences between observations:
x$diff <- c(0, as.numeric(diff(x$date)) )
# distinct event if more than 3 days apart:
x$event <- cumsum(x$diff>3)
# simply get maximum value per event:
tapply(x$value, x$event, max, na.rm=TRUE)
# Get one observation row per event (the maximum):
x$max <- unlist(tapply(x$value, x$event, FUN=function(v){
  out <- rep(0, length(v))
  out[which.max(v)] <- 1 # select first maximum value if there are ties
  out
}))
head(x, 20)
# independent event maxima rows:
x[x$max==1, 1:2]
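The same result can also be reached without adding helper columns to x. The following is only a sketch of one alternative, not a replacement for the code above; the names event, idx and event_max are made up for illustration, and it assumes value contains no NA.

```r
# Same fake data as in the question, so the block runs on its own.
set.seed(42)
x <- data.frame(date = as.Date("2017-08-01") + cumsum(ceiling(rexp(200, rate = 0.2))),
                value = round(cumsum(rnorm(200, sd = 8))) + 500)

# Event id: a new event starts whenever the gap to the previous row exceeds 3 days.
event <- cumsum(c(0, as.numeric(diff(x$date))) > 3)

# For each event, find the row index of its first maximum (ties go to the earliest row).
idx <- vapply(split(seq_len(nrow(x)), event),
              function(i) i[which.max(x$value[i])],
              integer(1))

# One full row per event.
event_max <- x[idx, ]
event_max
```

Working with row indices keeps x unchanged and returns every column of the selected rows, which may be convenient if the data.frame has more than two columns.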
Time differences in hours can be obtained like this:
diffs <- diff(x$date) # diff() on Dates already returns a difftime (in days)
units(diffs) <- "hours"
diffs <- as.numeric(diffs)
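If the event threshold is to be given in hours, the hourly differences can feed the same cumsum() grouping. This is a small sketch under the assumption that a gap of more than 72 hours (the three days from the question) separates events; gap_hours and event_h are illustrative names.

```r
# Same fake data as in the question, so the block runs on its own.
set.seed(42)
x <- data.frame(date = as.Date("2017-08-01") + cumsum(ceiling(rexp(200, rate = 0.2))),
                value = round(cumsum(rnorm(200, sd = 8))) + 500)

# Gaps between consecutive observations, converted from days to hours.
diffs <- diff(x$date)
units(diffs) <- "hours"
gap_hours <- c(0, as.numeric(diffs)) # the first row has no predecessor

# A new event starts whenever the gap exceeds 72 hours (assumed threshold).
event_h <- cumsum(gap_hours > 72)
table(event_h)
```

With Date columns the gaps are always multiples of 24 hours, so this gives the same grouping as the three-day rule; the hourly form matters mainly when the timestamps are POSIXct.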