I have this dataset, which includes the client ID, parking number, start time, and duration.
library(lubridate)
set.seed(1234)
ex <- data.frame(
  client = sample(1:700, 500, replace = TRUE),
  parking = sample(1:5, 1000, replace = TRUE),
  start = sort(as.POSIXct(sample(1377100000:1377230000, 1000), origin = "1960-01-01")),
  duration = format(as.POSIXct(sample(1478130000:1478130000, 1000), origin = "1960-01-01 00:00:00"), "%M:%S")
)
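How can I calculate, for each parking lot and each hour, the number of entrances, exits, and parked cars?
So far, I compute the stop time and add new columns as follows: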
ex$intime <- format(ex$start, "%Y-%m-%d %H")
ex$outtime <- format(ex$stop, "%Y-%m-%d %H")
dates = seq(ymd_h(min(ex$intime)), ymd_h(max(ex$outtime)), by = "hour")
and then count the number of cars currently in the parking lot:
counts = data.frame(date = dates,
                    count = sapply(dates, function(x) sum(x <= ex$parkingStop & x >= ex$parkingStart)))
Please help me calculate the entrances, exits, and cars that stayed. Thank you!
Expected output (just an example):
                  date entrances exits stayed
1  2017-04-08 00:00:00         0     0     10
2  2017-04-08 01:00:00        11     1     21
3  2017-04-08 02:00:00         8     4     25
4  2017-04-08 03:00:00         3     1     27
5  2017-04-08 04:00:00         2     7     22
6  2017-04-08 05:00:00         1     2     21
7  2017-04-08 06:00:00         2     9     14
8  2017-04-08 07:00:00         3     7     10
9  2017-04-08 08:00:00         3    12      1
10 2017-04-08 09:00:00        12     2     11
11 2017-04-08 10:00:00        33    23     21
12 2017-04-08 11:00:00        45    22     44
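Answer 0 (score: 2)
As people pointed out in the comments, your code does not run, but if I understand correctly, you are trying to do something like the following: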
library(lubridate)
library(dplyr)
set.seed(1234)
ex <- tibble(
  client = sample(1:700, 1000, replace = TRUE),
  parking = sample(1:5, 1000, replace = TRUE),
  intime = sort(as.POSIXct(sample(1377100000:1377230000, 1000), origin = "1960-01-01")),
  # 1478130000:1478130000 has length 1, so sample() draws from 1:1478130000;
  # formatting with "%M:%S" turns these values into random M:S durations
  dur = format(as.POSIXct(sample(1478130000:1478130000, 1000), origin = "1960-01-01 00:00:00"), "%M:%S")
) %>%
  # exit time = entry time + parsed duration
  mutate(outtime = intime + as.period(ms(dur), unit = "sec"))

# entrances per hour, joined with exits per hour
ex %>%
  group_by(date_hours = floor_date(intime, "hour")) %>%
  summarise(entrances = sum(floor_date(intime, "hour") == date_hours)) %>%
  full_join(
    ex %>%
      group_by(date_hours = floor_date(outtime, "hour")) %>%
      summarise(exits = sum(floor_date(outtime, "hour") == date_hours),
                # of the cars leaving in this hour, how many entered before the hour began
                cars = sum(intime <= date_hours))
  )
So you get a table like this:
Joining, by = "date_hours"
# A tibble: 38 x 4
date_hours entrances exits cars
<dttm> <int> <int> <int>
1 2003-08-21 17:00:00 7 NA NA
2 2003-08-21 18:00:00 25 21 7
3 2003-08-21 19:00:00 27 19 11
4 2003-08-21 20:00:00 23 31 19
5 2003-08-21 21:00:00 24 25 11
6 2003-08-21 22:00:00 22 17 10
7 2003-08-21 23:00:00 30 30 15
8 2003-08-22 00:00:00 28 27 15
9 2003-08-22 01:00:00 18 25 16
10 2003-08-22 02:00:00 34 28 9
# ... with 28 more rows
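Note that the cars column above only looks at cars leaving in a given hour, and nothing is split by parking. If what you want is the breakdown per parking, with "stayed" meaning the cars still parked at the end of each hour (which is how most rows of the expected output in the question seem to add up), a rough sketch could look like the one below. The `toy` tibble and its values are made up for illustration, and the names `events`, `grid` and `hourly` are my own, not from the question.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data with the same columns as `ex` (parking, intime, outtime)
toy <- tibble(
  parking = c(1, 1, 1, 2, 2),
  intime  = ymd_hm(c("2017-04-08 00:10", "2017-04-08 00:40", "2017-04-08 01:05",
                     "2017-04-08 00:20", "2017-04-08 02:15")),
  outtime = ymd_hm(c("2017-04-08 01:30", "2017-04-08 00:55", "2017-04-08 02:20",
                     "2017-04-08 02:45", "2017-04-08 02:50"))
)

# One row per event, binned to the hour in which it happened
events <- bind_rows(
  transmute(toy, parking, date_hours = floor_date(intime, "hour"),
            entrances = 1L, exits = 0L),
  transmute(toy, parking, date_hours = floor_date(outtime, "hour"),
            entrances = 0L, exits = 1L)
)

# Every parking x hour combination, so quiet hours show up with zero counts
grid <- expand.grid(
  parking    = sort(unique(toy$parking)),
  date_hours = seq(min(events$date_hours), max(events$date_hours), by = "hour")
)

hourly <- grid %>%
  left_join(
    events %>%
      group_by(parking, date_hours) %>%
      summarise(entrances = sum(entrances), exits = sum(exits), .groups = "drop"),
    by = c("parking", "date_hours")
  ) %>%
  mutate(entrances = coalesce(entrances, 0L),
         exits     = coalesce(exits, 0L)) %>%
  arrange(parking, date_hours) %>%
  group_by(parking) %>%
  # cars still parked at the end of the hour = running entrances minus running exits
  mutate(stayed = cumsum(entrances - exits)) %>%
  ungroup()

hourly
```

To try it on the simulated data, replace `toy` with `ex`. Keep in mind that the running total starts from zero, so any cars already parked before the first hour in the data are not counted.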