Find the range of values for each unique date

Time: 2013-09-01 10:43:39

Tags: r range

I have the following example:

Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"), 
             to = as.POSIXct("2010-10-10 22:00"), by = 3600)
Dat <- data.frame(DateTime = Date1,
                  t = rnorm(length(Date1)))

I want to find the range of the values (i.e. maximum - minimum) for each date.

First, I defined additional columns that identify the unique days, both as a calendar date and as a day of year (doy).

Dat$date <- format(Dat$DateTime, format = "%Y-%m-%d") # find the unique days
Dat$doy <- as.numeric(format(Dat$DateTime, format="%j")) # day of year for each observation

Then, to find the range, I tried

by(Dat$t, Dat$doy, function(x) range(x))

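However, this returns the range as two values (the minimum and the maximum) rather than a single value. So my question is: how can I compute the range for each day and return it in a data.frame of the following form?
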
new_data <- data.frame(date = unique(Dat$date),
                       range = ...)

Can anyone suggest a method?

2 Answers:

Answer 0: (score: 2)

I tend to use tapply for this sort of thing. ave is also useful sometimes. Here:

> dr = tapply(Dat$t,Dat$doy,function(x){diff(range(x))})

Always check tricky stuff:

> dr[1]
     121 
3.084317 
> diff(range(Dat$t[Dat$doy==121]))
[1] 3.084317
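
For comparison, here is a rough sketch of how ave (mentioned above) differs from tapply. The seed and the column name day_range are my own additions for illustration, not part of the question. ave returns one value per row, so every observation carries its day's range, while tapply returns one value per day.

```r
# Rebuild the question's example data (the seed is an assumption, added for reproducibility)
set.seed(42)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"),
             to = as.POSIXct("2010-10-10 22:00"), by = 3600)
Dat <- data.frame(DateTime = Date1, t = rnorm(length(Date1)))
Dat$doy <- as.numeric(format(Dat$DateTime, format = "%j"))

# ave() returns a vector as long as the input: each row gets its day's range
# (day_range is a hypothetical column name)
Dat$day_range <- ave(Dat$t, Dat$doy, FUN = function(x) diff(range(x)))

# tapply() returns one value per day instead
dr <- tapply(Dat$t, Dat$doy, function(x) diff(range(x)))
```

So ave is probably the better fit when the per-day range has to sit next to the hourly values, and tapply when a one-row-per-day summary is wanted.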

Use the names attribute to get the dates and values into a data frame:

> new_data = data.frame(date=names(dr),range=dr)
> head(new_data)
    date    range
121  121 3.084317
122  122 4.204053

Do you want to convert the day-of-year numbers back to date objects?
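
If so, one possible way is to add the day of year to 1 January of the relevant year. This is only a sketch: the year 2010 is taken from the example data and the seed is my own addition. A series spanning several years would need the year kept alongside doy, or could be grouped by the date column directly.

```r
# Rebuild the question's example data (the seed is an assumption, added for reproducibility)
set.seed(42)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"),
             to = as.POSIXct("2010-10-10 22:00"), by = 3600)
Dat <- data.frame(DateTime = Date1, t = rnorm(length(Date1)))
Dat$doy <- as.numeric(format(Dat$DateTime, format = "%j"))

# Daily range (max - min), one value per day of year
dr <- tapply(Dat$t, Dat$doy, function(x) diff(range(x)))

# Day of year 1 is 1 January, so subtract 1 before adding to the origin.
# The origin year 2010 is assumed from the example data.
doy <- as.integer(names(dr))
new_data <- data.frame(date = as.Date(doy - 1, origin = "2010-01-01"),
                       range = as.vector(dr))
head(new_data)
```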

Answer 1: (score: 2)

# Use the data.table package
require(data.table)

# Set seed so data is reproducible 
set.seed(42)

# Create data.table
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"), to = as.POSIXct("2010-10-10 22:00"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))

# Set key on data.table so that it is sorted by date
setkey(DT, "date")

# Make a new data.table with the required information (can be used as a data.frame)
new_data <- DT[, diff(range(t)), by = date]

#            date       V1
# 1:   2010-05-01 4.943101
# 2:   2010-05-02 4.309401
# 3:   2010-05-03 4.568818
# 4:   2010-05-04 2.707036
# 5:   2010-05-05 4.362990
# ---                    
# 159: 2010-10-06 2.659115
# 160: 2010-10-07 5.820803
# 161: 2010-10-08 4.516654
# 162: 2010-10-09 4.010017
# 163: 2010-10-10 3.311408
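
The result column above gets the default name V1. As a small sketch, assuming a data.table version that supports the .() alias for list(), the column can be named in the same call and the result converted to a plain data.frame if needed. The name new_df is hypothetical.

```r
library(data.table)

# Same example data as above
set.seed(42)
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"),
             to = as.POSIXct("2010-10-10 22:00"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))

# Name the result column directly instead of accepting the default V1
new_data <- DT[, .(range = diff(range(t))), by = date]

# Convert to a plain data.frame if downstream code expects one
# (new_df is a hypothetical name)
new_df <- as.data.frame(new_data)
```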