I have a dataset containing durations in seconds (vec1) and a grouping variable (vec2):
df <- data.frame(
  vec1 = c(180, 7560, 16020, 300, 8940, 15120,
           9600, 300, 2580, 25860, 13200, 6900, 22380, 11460, 15480),
  vec2 = c("a", "a", "a", "a", "a", "a",
           "b", "b", "b", "b", "b", "b", "b", "b", "b")
)
I want to compute the mean duration for each group. I tested two approaches:
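library(plyr)  # needed so that .() is found; plyr::ddply alone does not attach it

# Approach 1: compute the per-group mean as a number, then convert the whole column
test1 <- plyr::ddply(df, .(vec2), function(x) mean(x$vec1))
test1$V1 <- as.POSIXct(test1$V1, origin = "1970-01-01", tz = "GMT")

# Approach 2: convert to POSIXct inside the function applied to each group
test2 <- plyr::ddply(df, .(vec2), function(x) as.POSIXct(mean(x$vec1), origin = "1970-01-01", tz = "GMT"))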
The results differ (test2 is one hour ahead, while test1 gives the correct answer):
test1
  vec2                  V1
1    a 1970-01-01 02:13:40
2    b 1970-01-01 03:19:33

test2
  vec2                  V1
1    a 1970-01-01 03:13:40
2    b 1970-01-01 04:19:33
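Since the offset is exactly one hour, my guess is that the two columns hold the same instants and only the time zone used for printing differs. The base R sketch below checks just that part: it takes the mean of group "a" and formats the same POSIXct value in GMT and in a zone one hour ahead. "Europe/Paris" is only an assumed stand-in for my local time zone, and this does not show what ddply itself does to the column.

```r
# Mean of group "a" in seconds, taken from the data above.
secs <- mean(c(180, 7560, 16020, 300, 8940, 15120))

# One instant, created with an explicit GMT time zone.
x <- as.POSIXct(secs, origin = "1970-01-01", tz = "GMT")

# Format the same instant in two zones.
# ASSUMPTION: "Europe/Paris" (GMT+1 in January 1970) stands in for my local zone.
in_gmt   <- format(x, tz = "GMT")
in_local <- format(x, tz = "Europe/Paris")

print(in_gmt)         # "1970-01-01 02:13:40", matches test1
print(in_local)       # "1970-01-01 03:13:40", matches test2
print(as.numeric(x))  # 8020, the underlying value is unchanged
```

If that guess is right, comparing attr(test1$V1, "tzone") with attr(test2$V1, "tzone") should show whether the GMT setting survived in both results.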
I know which code gives me the correct answer, but I would like to understand why the results differ. Do you have an explanation? Thanks!