Aggregating by interval in R

Time: 2016-01-17 20:12:53

Tags: r aggregate intervals

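I have the following data: an aggregated JSON feed of bike availability for a single Citi Bike station, sampled every 10 minutes for one month. I have successfully separated the time of day from the date using data$time <- format(data$created_at, "%H:%M:%S"), but I am not sure how to compute summary statistics for each interval without running a separate, time-consuming aggregation for every interval.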


> head(data)
              id station_id status available_bike_count available_dock_count
28847606 3801486        295 Active                   33                   34
28847938 3801818        295 Active                   37                   30
28848270 3802150        295 Active                   34                   33
28848602 3802482        295 Active                   36                   31
28848934 3802814        295 Active                   33                   34
28849266 3803146        295 Active                   32                   35
                  created_at station_summary_id month year     day
28847606 2013-10-01 00:00:00              11641    10 2013 Tuesday
28847938 2013-10-01 00:10:00              11642    10 2013 Tuesday
28848270 2013-10-01 00:20:00              11643    10 2013 Tuesday
28848602 2013-10-01 00:30:00              11644    10 2013 Tuesday
28848934 2013-10-01 00:40:00              11645    10 2013 Tuesday
28849266 2013-10-01 00:50:00              11646    10 2013 Tuesday
> dim(data)
[1] 4267   10

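The output should be a data.frame with 144 rows, one per 10-minute interval, and 3 additional columns holding the mean, median, and mode of available_bike_count. It should only include readings from weekdays (Monday to Friday). The data.frame should look like this: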

head(output)
  interval     mean    median      mode
1 00:00:00 11.19383 11.729039  6.832517
2 00:10:00 10.94193  9.873530  9.161699
3 00:20:00 19.58576  5.386799  1.271674
4 00:30:00 10.06028 15.003440  6.306354
5 00:40:00 11.33270  2.041195 12.506719   
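
One possible way to get the output described above in a single pass is base R's aggregate() with a function that returns all three statistics at once. This is only a sketch under stated assumptions: the sample data is a made-up stand-in for the real feed, stat_mode is a hypothetical helper (base R has no built-in statistical mode), and weekdays are selected from created_at with format(, "%u") rather than from the day column, to avoid depending on locale-specific day names.

```r
# Hypothetical stand-in for the real feed: one reading every 10 minutes
# for two weeks, starting Tuesday 2013-10-01.
set.seed(1)
created_at <- seq(as.POSIXct("2013-10-01 00:00:00", tz = "UTC"),
                  by = "10 min", length.out = 144 * 14)
data <- data.frame(
  created_at = created_at,
  available_bike_count = sample(0:40, length(created_at), replace = TRUE)
)
data$time <- format(data$created_at, "%H:%M:%S")

# Hypothetical helper: the most frequent value, taking the smallest on ties.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Keep Monday-Friday only; "%u" is the ISO weekday number (1 = Monday).
weekday_num <- as.integer(format(data$created_at, "%u"))
weekdays_only <- data[weekday_num <= 5, ]

# One aggregation over all intervals, computing the three statistics together.
output <- aggregate(
  available_bike_count ~ time,
  data = weekdays_only,
  FUN = function(x) c(mean = mean(x), median = median(x), mode = stat_mode(x))
)

# aggregate() stores the statistics in a single matrix column; flatten it
# into ordinary columns and rename to match the desired layout.
output <- do.call(data.frame, output)
names(output) <- c("interval", "mean", "median", "mode")
head(output)
```

With a full day of 10-minute readings, the result has 144 rows, one per interval. If the existing day column is preferred for filtering, data[data$day %in% c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"), ] should work equally well, assuming the day names are stored in English.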

0 answers:

No answers yet