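I'm trying to sum a set of POSIXct objects by a factor variable, but I get an error saying that sum is not defined for POSIXt objects. Computing the mean works fine, though. How can I use tapply to get the total time per group?
Example: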
data <- data.frame(time = c("2:50:04", "1:24:10", "3:10:43", "1:44:26", "2:10:19", "3:01:04"),
group = c("A","A","A","B","B","B"))
data$group <- as.factor(data$group)
data$time <- as.POSIXct(paste("1970-01-01", data$time), format="%Y-%m-%d %H:%M:%S", tz="GMT")
# works
tapply(data$time, data$group, mean)
# doesn't work
tapply(data$time, data$group, sum)
Answer 0 (score: 1)
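Date-time objects cannot be summed: adding two points in time makes no semantic sense, and the + operator is not defined for two POSIXct objects either.
Perhaps you want to model the values as time differences and sum those instead?
Try: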
times <- as.difftime(c("2:50:04", "1:24:10", "3:10:43",
"1:44:26", "2:10:19", "3:01:04"), "%H:%M:%S")
sum(times)
difftime objects are also what you get when you subtract two date-time objects (which does make semantic sense).
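To illustrate the point about subtraction, here is a minimal sketch. The variable names start, end and elapsed are made up for this example and do not come from the question:

```r
# Two points in time on the same dummy date (hypothetical values)
start <- as.POSIXct("1970-01-01 00:00:00", tz = "GMT")
end   <- as.POSIXct("1970-01-01 02:50:04", tz = "GMT")

# Subtracting two POSIXct objects yields a difftime, not a POSIXct
elapsed <- end - start
class(elapsed)

# Ask for explicit units when converting to a plain number
as.numeric(elapsed, units = "secs")
```

Unlike points in time, durations like this can be added together with sum.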
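Edit
A complete solution to the OP's problem in a semantically cleaner way (tapply seems to break the structure of the difftime class, so use group_by from the dplyr package instead):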
library(dplyr)
times <- as.difftime(c("2:50:04", "1:24:10", "3:10:43",
"1:44:26", "2:10:19", "3:01:04"), format="%H:%M:%S")
data <- data.frame(time = times, group = c("A","A","A","B","B","B"))
summarise(group_by(data, group), sum(time))
This gives the following output:
Source: local data frame [2 x 2]
  group      sum(time)
1     A 7.415833 hours
2     B 6.930278 hours
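If you would rather stay in base R and keep tapply, one possible workaround is to convert the durations to plain seconds first, so it does not matter that tapply drops the difftime class. This is only a sketch; the names group, secs and hours are chosen for illustration:

```r
# Parse the durations as difftime values
times <- as.difftime(c("2:50:04", "1:24:10", "3:10:43",
                       "1:44:26", "2:10:19", "3:01:04"),
                     format = "%H:%M:%S")
group <- factor(c("A", "A", "A", "B", "B", "B"))

# Convert to numeric seconds explicitly, then sum per group;
# the result is an ordinary named numeric array
secs <- tapply(as.numeric(times, units = "secs"), group, sum)

# Convert back to hours for readability
hours <- secs / 3600
hours
```

The per-group totals should agree with the dplyr result above (about 7.42 hours for A and 6.93 hours for B), just without the units attached.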