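I have the following data: an aggregated JSON feed of bike availability for a single Citi Bike station, sampled every 10 minutes over one month. I have managed to separate the time of day from the date with
data$time <- format(data$created_at, "%H:%M:%S")
but I am not sure how to compute summary statistics for every interval at once, without running a separate, time-consuming aggregation for each interval.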
> head(data)
              id station_id status available_bike_count available_dock_count
28847606 3801486        295 Active                   33                   34
28847938 3801818        295 Active                   37                   30
28848270 3802150        295 Active                   34                   33
28848602 3802482        295 Active                   36                   31
28848934 3802814        295 Active                   33                   34
28849266 3803146        295 Active                   32                   35
                  created_at station_summary_id month year     day
28847606 2013-10-01 00:00:00              11641    10 2013 Tuesday
28847938 2013-10-01 00:10:00              11642    10 2013 Tuesday
28848270 2013-10-01 00:20:00              11643    10 2013 Tuesday
28848602 2013-10-01 00:30:00              11644    10 2013 Tuesday
28848934 2013-10-01 00:40:00              11645    10 2013 Tuesday
28849266 2013-10-01 00:50:00              11646    10 2013 Tuesday
> dim(data)
[1] 4267   10
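The output should be a data.frame with 144 rows, one per 10-minute interval, and the mean, median, and mode of available_bike_count as 3 additional columns. It should include only weekday (Monday to Friday) observations. The data.frame should look like this: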
head(output)
  interval     mean    median      mode
1 00:00:00 11.19383 11.729039  6.832517
2 00:10:00 10.94193  9.873530  9.161699
3 00:20:00 19.58576  5.386799  1.271674
4 00:30:00 10.06028 15.003440  6.306354
5 00:40:00 11.33270  2.041195 12.506719
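
For illustration, here is a minimal base R sketch of one way the table described above might be produced in a single pass. It is only a sketch under stated assumptions: `stat_mode` and `summarise_intervals` are hypothetical helper names, the mode is taken to be the most frequent value (the smallest one on ties), the `day` column is assumed to hold English day names as in `head(data)`, and `toy` is a made-up stand-in for the real data.

```r
# Hypothetical helper: most frequent value in x (smallest value wins ties,
# because table() orders its entries by increasing value).
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Summarise available_bike_count per time-of-day interval, weekdays only.
# Assumes df has a `time` column built with format(created_at, "%H:%M:%S")
# and a `day` column with English day names.
summarise_intervals <- function(df) {
  weekday_rows <- df[!(df$day %in% c("Saturday", "Sunday")), ]
  agg <- aggregate(
    available_bike_count ~ time,
    data = weekday_rows,
    FUN = function(x) c(mean = mean(x), median = median(x), mode = stat_mode(x))
  )
  # aggregate() stores the three statistics in one matrix column; unpack it.
  data.frame(interval = agg$time, agg$available_bike_count)
}

# Made-up stand-in for the real data, just to show the call.
toy <- data.frame(
  time = c("00:00:00", "00:00:00", "00:00:00", "00:10:00", "00:10:00", "00:00:00"),
  day = c("Monday", "Tuesday", "Wednesday", "Monday", "Tuesday", "Saturday"),
  available_bike_count = c(30, 30, 36, 10, 20, 99)
)
output <- summarise_intervals(toy)
```

With the real data the call would presumably be `summarise_intervals(data)`, which should give one row per distinct `time` value (144 if every 10-minute slot occurs on at least one weekday).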