I have the following example:
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"),
to = as.POSIXct("2010-10-10 22:00"), by = 3600)
Dat <- data.frame(DateTime = Date1,
t = rnorm(length(Date1)))
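I would like to find the range of the values (i.e. maximum - minimum) for each day.
First, I defined additional columns that identify the unique days, both as a calendar date and as a day of year (doy).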
Dat$date <- format(Dat$DateTime, format = "%Y-%m-%d") # find the unique days
Dat$doy <- as.numeric(format(Dat$DateTime, format="%j")) # day of year for each observation
Then, to find the range, I tried:
by(Dat$t, Dat$doy, function(x) range(x))
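But this returns the range as two values (the minimum and the maximum) rather than a single value. So my question is: how do I compute the range for each day and return it in a data.frame of the following form?
new_data <- data.frame(date = unique(Dat$date),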
range = ...)
Can anyone suggest a way to do this?
Answer 0 (score: 2)
I tend to use tapply for this kind of thing; ave is sometimes useful too. Here:
> dr = tapply(Dat$t,Dat$doy,function(x){diff(range(x))})
Always double-check anything tricky:
> dr[1]
121
3.084317
> diff(range(Dat$t[Dat$doy==121]))
[1] 3.084317
Use the names attribute to get the days alongside the values and build a data frame:
> new_data = data.frame(date=names(dr),range=dr)
> head(new_data)
date range
121 121 3.084317
122 122 4.204053
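For comparison, here is a minimal sketch of the ave alternative mentioned above. Unlike tapply, ave returns a vector as long as its input, so each row gets the range of its own day. The column name day_range, the fixed seed, and the explicit tz = "UTC" are assumptions added for illustration and are not part of the original example.

```r
# Rebuild the example data (seed and time zone are illustrative assumptions)
set.seed(1)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00", tz = "UTC"),
             to = as.POSIXct("2010-10-10 22:00", tz = "UTC"), by = 3600)
Dat <- data.frame(DateTime = Date1, t = rnorm(length(Date1)))
Dat$doy <- as.numeric(format(Dat$DateTime, format = "%j"))

# ave() applies FUN within each doy group and recycles the single result
# across all rows of that group; day_range is a hypothetical column name
Dat$day_range <- ave(Dat$t, Dat$doy, FUN = function(x) diff(range(x)))

head(Dat)
```

This form is handy when the daily range is needed next to the hourly values, for example to normalise each observation by the spread of its day.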
Did you want to convert the day-of-year numbers back to date objects?
Answer 1 (score: 2)
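If dates are wanted in the result, two possible routes are sketched below: group by the date string directly, or convert the day-of-year names back with as.Date and an origin. The sketch assumes all observations fall within 2010, as in the example, and uses an explicit tz = "UTC" and a fixed seed that are not in the original code.

```r
# Rebuild the example data (seed and time zone are illustrative assumptions)
set.seed(1)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00", tz = "UTC"),
             to = as.POSIXct("2010-10-10 22:00", tz = "UTC"), by = 3600)
Dat <- data.frame(DateTime = Date1, t = rnorm(length(Date1)))
Dat$date <- format(Dat$DateTime, format = "%Y-%m-%d")
Dat$doy <- as.numeric(format(Dat$DateTime, format = "%j"))

# Route 1: group by the date string, so the names are already dates
dr <- tapply(Dat$t, Dat$date, function(x) diff(range(x)))
new_data <- data.frame(date = as.Date(names(dr)),
                       range = as.numeric(dr),
                       row.names = NULL)

# Route 2: convert day-of-year back to a Date; day 1 is 1 January,
# hence the "- 1" (only valid while the data stay within one year)
dr_doy <- tapply(Dat$t, Dat$doy, function(x) diff(range(x)))
doy_dates <- as.Date(as.numeric(names(dr_doy)) - 1, origin = "2010-01-01")

head(new_data)
```

Grouping by the date string is the safer choice if the series could ever span more than one year, since day-of-year values repeat.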
# Use the data.table package
library(data.table) # library() stops with an error if the package is missing
# Set seed so data is reproducible
set.seed(42)
# Create data.table
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"), to = as.POSIXct("2010-10-10 22:00"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))
# Set key on data.table so that it is sorted by date
setkey(DT, "date")
# Make a new data.table with the required information (can be used as a data.frame)
new_data <- DT[, diff(range(t)), by = date]
# date V1
# 1: 2010-05-01 4.943101
# 2: 2010-05-02 4.309401
# 3: 2010-05-03 4.568818
# 4: 2010-05-04 2.707036
# 5: 2010-05-05 4.362990
# ---
# 159: 2010-10-06 2.659115
# 160: 2010-10-07 5.820803
# 161: 2010-10-08 4.516654
# 162: 2010-10-09 4.010017
# 163: 2010-10-10 3.311408
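The result above carries the auto-generated column name V1. As a minimal sketch, the column can be named inside the call and the result converted to a plain data.frame if that is what downstream code expects. The name new_df and the explicit tz = "UTC" are assumptions for illustration; the data.table package must be installed.

```r
library(data.table)

# Same data as above, with an explicit time zone (an illustrative assumption)
set.seed(42)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00", tz = "UTC"),
             to = as.POSIXct("2010-10-10 22:00", tz = "UTC"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))

# .() builds a named list, so the result column is called "range" instead of V1
new_data <- DT[, .(range = diff(range(t))), by = date]

# Optional: a plain data.frame copy (new_df is a hypothetical name)
new_df <- as.data.frame(new_data)
head(new_df)
```

Inside the brackets, range(t) still resolves to the base function because the new column name only applies to the output.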