R: group a data frame by hour

Time: 2013-07-05 10:55:34

Tags: r statistics dataframe

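I have a data frame whose Date column contains DateTime values, plus three columns holding counts for each datetime.

I am trying to sum the three count columns by hour.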

Sample Data

The aggregate function works for a single column, but I am trying to do this for the whole data frame. Any hints?

aggregate(DateFreq$ColA,by=list((substr(DateFreq$Date,1,13))),sum) 

2 answers:

Answer 0 (score: 5)

You can aggregate with dplyr, using dplyr::group_by and dplyr::summarise:

library(lubridate)
library(anytime)
library(tidyverse)

Lines <- "Date,c1,c2,c3
06/25/2013 12:01,0,1,1
06/25/2013 12:08,-1,1,1
06/25/2013 12:48,0,1,1
06/25/2013 12:58,0,1,1
06/25/2013 13:01,0,1,1
06/25/2013 13:08,0,1,1
06/25/2013 13:48,0,1,1
06/25/2013 13:58,0,1,1
06/25/2013 14:01,0,1,1
06/25/2013 14:08,0,1,1
06/25/2013 14:48,0,1,1
06/25/2013 14:58,0,1,1"

setClass("myDate")
setAs("character","myDate", function(from) anytime(from))
df <- read.csv(text = Lines, header=TRUE,  colClasses = c("myDate", "numeric", "numeric", "numeric"))

df %>%
  group_by(Date=floor_date(Date, "1 hour")) %>%
  summarize(c1=sum(c1), c2=sum(c2), c3=sum(c3))
# A tibble: 3 × 4
                 Date    c1    c2    c3
               <dttm> <dbl> <dbl> <dbl>
1 2013-06-25 12:00:00    -1     4     4
2 2013-06-25 13:00:00     0     4     4
3 2013-06-25 14:00:00     0     4     4
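If the count columns should not be listed one by one, the same grouping can be combined with across(). This is only a sketch, assuming dplyr 1.0 or later and lubridate are installed; the small data frame below is hypothetical sample data, not the asker's.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data mirroring the layout used above
df <- data.frame(
  Date = as.POSIXct(c("2013-06-25 12:01", "2013-06-25 12:08",
                      "2013-06-25 13:01", "2013-06-25 13:48"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC"),
  c1 = c(0, -1, 0, 0),
  c2 = c(1, 1, 1, 1),
  c3 = c(1, 1, 1, 1)
)

# Floor each timestamp to the hour, then sum every numeric column per hour
hourly <- df %>%
  group_by(Date = floor_date(Date, "hour")) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

print(hourly)
```

The grouping column is excluded from across() automatically, so only c1, c2 and c3 are summed.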

Answer 1 (score: 3)

You can use the formula interface of aggregate. But you should first create a proper hour variable.

Here is an example:

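# Extract the hour of day (0-23); assumes dat$Date is already a date-time (e.g. POSIXct)
dat$hour <- as.POSIXlt(dat$Date)$hour
# Drop Date before aggregating: "." would otherwise include it, and sum() fails on date-times
aggregate(. ~ hour, data = dat[names(dat) != "Date"], sum)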
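One caveat: the hour of day on its own merges the same hour from different days. A base R sketch that keys on date plus hour could look like the following; the data frame and the column names c1, c2, c3 are assumptions made for illustration.

```r
# Hypothetical sample data; column names c1..c3 are assumptions
dat <- data.frame(
  Date = as.POSIXct(c("2013-06-25 12:01", "2013-06-25 12:58",
                      "2013-06-25 13:08", "2013-06-26 12:30"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC"),
  c1 = c(0, -1, 0, 2),
  c2 = c(1, 1, 1, 1),
  c3 = c(1, 1, 1, 1)
)

# Key on date plus hour so that 12:00 on different days stays separate
dat$hour <- format(dat$Date, "%Y-%m-%d %H:00")

# cbind() on the left-hand side names the columns to sum explicitly
res <- aggregate(cbind(c1, c2, c3) ~ hour, data = dat, FUN = sum)

print(res)
```

Naming the columns with cbind() also keeps the Date column out of the sum, which a bare "." on the left-hand side would not.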