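Aggregation based on a date range

Asked: 2016-10-06 07:56:55

Tags: r

I have the following data frame, which contains a date column and several value columns: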

DF2 <- data.frame("Date"=c("2016-09-01","2016-09-02","2016-09-03","2016-09-05","2016-09-06"),
              "Value1"=c(20,200,60,150,140),
              "Value2"=c(15,20,15,30,30),
              "Value3"=c(80,42,29,40,39))

Then I have two date input parameters:

dateFrom <- "2016-09-02"
dateTo <- "2016-09-05"

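How can I aggregate each numeric column (Value1 - Value3) over this date range? I want to use a simple sum as the aggregation. Thanks in advance for any suggestions.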

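5 answers:

Answer 0: (score: 1)

This should work. The data must be sorted by date, and both dateFrom and dateTo must appear in the Date column, because the rows are located by exact match.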

DF2 <- data.frame("Date"=as.Date(c("2016-09-01","2016-09-02","2016-09-03","2016-09-05","2016-09-06")),
              "Value1"=c(20,200,60,150,140),
              "Value2"=c(15,20,15,30,30),
              "Value3"=c(80,42,29,40,39))
dateFrom <- as.Date("2016-09-02")
dateTo <- as.Date("2016-09-05")
start <- which(DF2$Date == dateFrom)
end <- which(DF2$Date == dateTo)
lapply(DF2[start:end,2:4],sum)
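
The `which()` lookup above depends on both boundary dates being present in `Date` and on the rows being sorted. A minimal sketch of an alternative, assuming the same `DF2`, that selects rows with a logical mask so neither condition is needed. The name `in_range` and the end date `2016-09-04` (deliberately absent from the data) are introduced here only for illustration:

```r
DF2 <- data.frame(Date = as.Date(c("2016-09-01", "2016-09-02", "2016-09-03",
                                   "2016-09-05", "2016-09-06")),
                  Value1 = c(20, 200, 60, 150, 140),
                  Value2 = c(15, 20, 15, 30, 30),
                  Value3 = c(80, 42, 29, 40, 39))

# Illustrative range: the end date does not occur in DF2$Date.
dateFrom <- as.Date("2016-09-02")
dateTo   <- as.Date("2016-09-04")

# TRUE for every row whose date lies inside the closed range [dateFrom, dateTo].
in_range <- DF2$Date >= dateFrom & DF2$Date <= dateTo

# Sum each value column over the selected rows.
result <- colSums(DF2[in_range, c("Value1", "Value2", "Value3")])
print(result)
```

With this range only the rows for 2016-09-02 and 2016-09-03 are selected, whereas `which(DF2$Date == dateTo)` would return an empty result and `start:end` would fail.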

Answer 1: (score: 0)

Data:

DF2 <- data.frame("Date"=as.Date(c("2016-09-01","2016-09-02","2016-09-03","2016-09-05","2016-09-06"),format = "%Y-%m-%d"),
                  "Value1"=c(20,200,60,150,140),
                  "Value2"=c(15,20,15,30,30),
                  "Value3"=c(80,42,29,40,39))
dateFrom <- as.Date("2016-09-02",format = "%Y-%m-%d")
dateTo <- as.Date("2016-09-05",format = "%Y-%m-%d")

Using dplyr:

library(dplyr)    
DF2%>%filter(Date<=dateTo&Date>=dateFrom)%>%select(-Date)%>%colSums()
    Value1 Value2 Value3 
       410     65    111 

Edit: I changed the date type directly in DF2 (converting the column to Date). If you don't do that, you have to do this instead:

DF2 %>% transform(Date = as.Date(Date, format = "%Y-%m-%d"))%>%filter(Date<=dateTo&Date>=dateFrom)%>%select(-Date)%>%colSums()

Answer 2: (score: 0)

Is this what you want?

df$Date <- as.Date(df$Date)
r <- df[(df$Date >= dateFrom & df$Date <= dateTo),]
data.frame(Date=r$Date, Sum=rowSums(r[-1]))

#        Date Sum
#2 2016-09-02 262
#3 2016-09-03 104
#4 2016-09-05 220

Data

df <- structure(list(Date = c("2016-09-01", "2016-09-02", "2016-09-03", 
"2016-09-05", "2016-09-06"), Value1 = c(20, 200, 60, 150, 140
), Value2 = c(15, 20, 15, 30, 30), Value3 = c(80, 42, 29, 40, 
39)), .Names = c("Date", "Value1", "Value2", "Value3"), row.names = c(NA, 
-5L), class = "data.frame")

Answer 3: (score: 0)

I think this is what you want (keep your date field as character rather than factor):

DF2 <- data.frame("Date"=c("2016-09-01","2016-09-02","2016-09-03","2016-09-05","2016-09-06"),
                  "Value1"=c(20,200,60,150,140),
                  "Value2"=c(15,20,15,30,30),
                  "Value3"=c(80,42,29,40,39), stringsAsFactors = FALSE)

dateFrom <- "2016-09-02"
dateTo <- "2016-09-05"
apply(subset(DF2, Date >= dateFrom & Date <= dateTo)[2:4], 2, sum)
Value1 Value2 Value3 
   410     65    111
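
Comparing the dates as character strings works here because `YYYY-MM-DD` strings sort in the same order as the dates they represent. A small sketch of where that assumption breaks, using made-up day-first values (`d1`, `d2`) that are not from the question:

```r
# ISO-style strings: text order matches chronological order.
iso_ok <- "2016-09-02" < "2016-10-01"

# Hypothetical day-first strings: 5 Sep 2016 and 2 Oct 2016.
d1 <- "05/09/2016"
d2 <- "02/10/2016"

# As text, d2 sorts before d1 because "02" < "05".
text_says_d2_earlier <- d2 < d1

# As real dates, d2 (October) is later than d1 (September).
date_says_d2_earlier <- as.Date(d2, format = "%d/%m/%Y") <
                        as.Date(d1, format = "%d/%m/%Y")

print(c(iso_ok, text_says_d2_earlier, date_says_d2_earlier))
```

If the date column is not in `YYYY-MM-DD` form, converting with `as.Date()` before filtering avoids this mismatch.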

Answer 4: (score: 0)

Here is my simple lubridate solution:

library(lubridate)
interval <- interval(dateFrom, dateTo)

criteria <- ymd(DF2$Date) %within% interval


rowSums(DF2[criteria,2:4])
#  2   3   4 
#262 104 220 

colSums(DF2[criteria,2:4])
# Value1 Value2 Value3 
#    410     65    111 

I wasn't sure whether you want the sums by row (rowSums) or by column (colSums); you only need to change the last line of code.