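Counting entrances, exits, and parked cars in a parking area

Time: 2017-12-20 10:15:37

Tags: r dplyr

I have this dataset with a client ID, a parking lot number, a start time, and a duration.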

require(lubridate)
set.seed(1234)
ex <- data.frame(
  client = sample(1:700, 500, replace = T),
  parking = sample(1:5, 1000, replace = T),
  start = sort(as.POSIXct(sample(1377100000:1377230000, 1000), origin = "1960-01-01")),
  duration = format(as.POSIXct(sample(1478130000:1478130000, 1000), origin = "1960-01-01 00:00:00"), "%M:%S")
)

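How can I count, for each parking lot and each hour, the number of entrances, exits, and parked cars?

So far, I compute the stop time and add new vectors as follows: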

ex$intime <- format(ex$start, "%Y-%m-%d %H")
ex$outtime <- format(ex$stop, "%Y-%m-%d %H")
dates = seq(ymd_h(min(ex$intime)), ymd_h(max(ex$outtime)), by = "hour")

and count the cars currently in the parking lot:

counts = data.frame(date = dates,
                count = sapply(dates, function(x) sum(x <= ex$parkingStop & x >= ex$parkingStart)))

Please help me count the entrances, exits, and cars that stayed. Thanks!

Expected output (example only):

    date                 entrances  exits  stayed
1   2017-04-08 00:00:00          0      0      10
2   2017-04-08 01:00:00         11      1      21
3   2017-04-08 02:00:00          8      4      25
4   2017-04-08 03:00:00          3      1      27
5   2017-04-08 04:00:00          2      7      22
6   2017-04-08 05:00:00          1      2      21
7   2017-04-08 06:00:00          2      9      14
8   2017-04-08 07:00:00          3      7      10
9   2017-04-08 08:00:00          3     12       1
10  2017-04-08 09:00:00         12      2      11
11  2017-04-08 10:00:00         33     23      21
12  2017-04-08 11:00:00         45     22      44

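1 Answer:

Answer 0: (score: 2)

As people pointed out in the comments, your code does not run, but if I understand correctly, you are trying to do something like this: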

library(lubridate)
library(dplyr)

set.seed(1234)
ex <- tibble(
  client = sample(1:700, 1000, replace = TRUE),
  parking = sample(1:5, 1000, replace = TRUE),
  intime = sort(as.POSIXct(sample(1377100000:1377230000, 1000), origin = "1960-01-01")),
  # this will sample durations in M:S format
  # (a length-one range makes sample() draw from 1:1478130000)
  dur = format(as.POSIXct(sample(1478130000:1478130000, 1000), origin = "1960-01-01 00:00:00"), "%M:%S")
) %>% mutate(outtime=intime + as.period(ms(dur), unit = "sec"))  # exit time = entry time + duration

# entrances: rows whose entry time falls in each hour
ex %>% group_by(date_hours=floor_date(intime, "hour")) %>% 
  summarise(entrances = sum(floor_date(intime, "hour") == date_hours)) %>% 
  full_join(
    # exits: rows whose exit time falls in each hour;
    # cars: of those, the ones that entered before the hour began
    ex %>% group_by(date_hours=floor_date(outtime, "hour")) %>% 
      summarise(exits = sum(floor_date(outtime, "hour") == date_hours),
                cars=sum(intime<=date_hours))
  )

So you get a table like this:

Joining, by = "date_hours"
# A tibble: 38 x 4
            date_hours entrances exits  cars
                <dttm>     <int> <int> <int>
 1 2003-08-21 17:00:00         7    NA    NA
 2 2003-08-21 18:00:00        25    21     7
 3 2003-08-21 19:00:00        27    19    11
 4 2003-08-21 20:00:00        23    31    19
 5 2003-08-21 21:00:00        24    25    11
 6 2003-08-21 22:00:00        22    17    10
 7 2003-08-21 23:00:00        30    30    15
 8 2003-08-22 00:00:00        28    27    15
 9 2003-08-22 01:00:00        18    25    16
10 2003-08-22 02:00:00        34    28     9
# ... with 28 more rows
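
Two things the code above does not do, as far as I can tell: it does not split the counts by `parking`, and its `cars` column only counts cars that leave during an hour and had entered before that hour began, so it is not an occupancy figure. Hours without exits also come out as `NA`. Below is a minimal sketch of one way to get entrances, exits, and cars still parked at the end of each hour, per parking lot. It is not the answerer's method. The `visits` table and its values are made up for illustration, `stayed` is defined here as cumulative entrances minus cumulative exits, and the sketch assumes every lot is empty before the first recorded entry.

```r
library(dplyr)
library(lubridate)
library(tidyr)

# Hypothetical toy data: one row per visit, with entry and exit times.
visits <- tibble(
  parking = c(1, 1, 1, 2),
  intime  = ymd_hms(c("2017-04-08 00:10:00", "2017-04-08 00:50:00",
                      "2017-04-08 01:20:00", "2017-04-08 00:30:00")),
  outtime = ymd_hms(c("2017-04-08 01:15:00", "2017-04-08 02:05:00",
                      "2017-04-08 01:40:00", "2017-04-08 00:45:00"))
)

# Entrances and exits per parking lot and hour.
entrances <- visits %>%
  count(parking, date_hours = floor_date(intime, "hour"), name = "entrances")
exits <- visits %>%
  count(parking, date_hours = floor_date(outtime, "hour"), name = "exits")

# Every (parking, hour) combination, so quiet hours show up as zeros.
grid <- expand_grid(
  parking    = sort(unique(visits$parking)),
  date_hours = seq(floor_date(min(visits$intime), "hour"),
                   floor_date(max(visits$outtime), "hour"),
                   by = "hour")
)

hourly <- grid %>%
  left_join(entrances, by = c("parking", "date_hours")) %>%
  left_join(exits, by = c("parking", "date_hours")) %>%
  mutate(entrances = coalesce(entrances, 0L),
         exits     = coalesce(exits, 0L)) %>%
  group_by(parking) %>%
  arrange(date_hours, .by_group = TRUE) %>%
  # Cars still parked at the end of each hour (assumes the lot starts empty).
  mutate(stayed = cumsum(entrances) - cumsum(exits)) %>%
  ungroup()

print(hourly)
```

With the real data, `visits` would be replaced by `ex` (which already has `parking`, `intime`, and `outtime`). If cars are already parked before the first recorded hour, an initial count would have to be added to `stayed`.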