Question

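I'm having trouble calculating the average of the variable "count" for each hour of the day across several days. I have a dataset called data that looks like this: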

    Time                count
1   2019-06-30 05:00:00 17
2   2019-06-30 06:00:00 18
3   2019-06-30 07:00:00 26
4   2019-06-30 08:00:00 15
5   2019-07-01 00:00:00 13
6   2019-07-01 01:00:00 23
7   2019-07-01 02:00:00 13
8   2019-07-01 03:00:00 22

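It contains hourly values over several days. Now I want to calculate one value per hour, namely the average for that hour across all days. Like this: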

    Time        count
1   00:00       22
2   01:00       13
3   02:00       11
4   03:00       9

I'm new to R and have only managed to calculate daily averages:

DF2 <- data.frame(data, Day = as.Date(format(data$Time)))
aggregate(cbind(count) ~ Day, DF2, mean)

    Time        count
1   2019-06-30  22
2   2019-07-01  13
3   2019-07-02  11
4   2019-07-03  9

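But I can't get this to work for hourly averages. I tried to find solutions in other posts, but they either didn't work or seemed to require a lot of separate calculations. There must be a simple way to do this in R.

Here is the output of dput(droplevels(head(data, 4))):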

structure(list(Time = structure(1:4, .Label = c("2019-06-30 05:00:00", 
"2019-06-30 06:00:00", "2019-06-30 07:00:00", "2019-06-30 08:00:00"
), class = "factor"), count = c(17L, 18L, 26L, 15L)), row.names = c(NA, 
4L), class = "data.frame")

Any suggestions? Thanks in advance!

Maxi

Answer 1

Just extract the hours with substring, then aggregate on them.

d$hour <- substring(d$time, 12)
d.2 <- aggregate(count ~ substring(d$time, 12), d, mean)
head(d.2)
#       hour count
# 1 00:00:00 35.00
# 2 01:00:00 73.50
# 3 02:00:00 45.50
# 4 03:00:00 61.75
# 5 04:00:00 65.25
# 6 05:00:00 40.00

Or use ave to get the hourly averages as a new column.

d <- transform(d, h.average=ave(count, substring(time, 12)))
head(d)
#                  time count h.average
# 1 2019-06-30 00:00:00    40    35.00
# 2 2019-06-30 01:00:00    67    73.50
# 3 2019-06-30 02:00:00    34    45.50
# 4 2019-06-30 03:00:00    49    61.75
# 5 2019-06-30 04:00:00    67    65.25
# 6 2019-06-30 05:00:00    43    40.00
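
As a minimal sketch of the same substring idea applied to a data frame shaped like the one in the question (Time stored as a factor), something like the following should work. The sample values are made up for illustration, and the helper column name hour is an assumption, not something from the question.

```r
# Hypothetical sample: two days with the same three hours each (values are made up).
data <- data.frame(
  Time = factor(c("2019-06-30 00:00:00", "2019-06-30 01:00:00", "2019-06-30 02:00:00",
                  "2019-07-01 00:00:00", "2019-07-01 01:00:00", "2019-07-01 02:00:00")),
  count = c(10L, 20L, 30L, 20L, 40L, 60L)
)

# Characters 12 to 16 of "YYYY-MM-DD HH:MM:SS" are "HH:MM".
# as.character() makes the factor-to-string conversion explicit.
data$hour <- substring(as.character(data$Time), 12, 16)

# One mean per hour of day, pooled over all days.
hourly <- aggregate(count ~ hour, data, mean)
print(hourly)
```

Cutting at position 16 drops the seconds, which gives the HH:MM labels the question asks for.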

Answer 2

Using lubridate and dplyr: group by the hour of the time value.

Generate data

library(dplyr)
library(lubridate)

df <- data.frame(Time=seq(as.POSIXct('2019-06-30 00:00:00'), as.POSIXct('2019-07-03 23:00:00'), by=3600),
  count = floor(runif(96, 12,71))
)

Group by hour, average, and pretty-print

df %>% mutate(hour = lubridate::hour(Time)) %>%
  group_by(hour) %>% summarise(count=mean(count)) %>%
  # pretty print
  mutate(hour = sprintf("%02d:00", hour)) %>%
  print(n=24)
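
The generated data above already has a POSIXct Time column, whereas the question's dput output shows Time as a factor. A sketch of how the same pipeline might be adapted to that case is below. The sample values are made up, and tz = "UTC" is an assumption chosen so that no daylight-saving shifts affect the hour.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample with Time as a factor, as in the question's dput output.
data <- data.frame(
  Time = factor(c("2019-06-30 05:00:00", "2019-06-30 06:00:00",
                  "2019-07-01 05:00:00", "2019-07-01 06:00:00")),
  count = c(17L, 18L, 13L, 22L)
)

hourly <- data %>%
  # Convert factor -> character -> POSIXct before extracting the hour.
  mutate(Time = as.POSIXct(as.character(Time), tz = "UTC"),
         hour = lubridate::hour(Time)) %>%
  group_by(hour) %>%
  summarise(count = mean(count)) %>%
  # Format the integer hour as "HH:00" for display.
  mutate(hour = sprintf("%02d:00", hour))

print(hourly)
```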
