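I'm currently working with a dataset of taxi trips by New York City drivers. For each trip I have the driver ID, the pickup date and time, and the dropoff date and time. I want to calculate the waiting time between the dropoff time of the previous trip and the pickup time of the next trip. That means computing the time difference between two columns, using the dropoff time from the previous row and the pickup time from the current row, grouped by driver ID so that I never compute a time difference between trips of two different drivers.
A sample dataset looks like this: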
hack_license = c("303F79923DA5DA7A10DF15E2D91CDCF7","697ABFCDF7E7C77A01183C857132F2A4","697ABFCDF7E7C77A01183C857132F2A4","697ABFCDF7E7C77A01183C857132F2A4","ABE23CA71E2DE84972281BA1C70B6EBB","ABE23CA71E2DE84972281BA1C70B6EBB","BA83D7C383EAA4F9D78A1A8B83CB3E92","BA83D7C383EAA4F9D78A1A8B83CB3E92","D476A1872F1F6594BD638C274483ED06","D476A1872F1F6594BD638C274483ED06")
pickup_datetime = c("2013-12-31 23:01:07","2013-12-31 23:04:00","2013-12-31 23:31:00","2013-12-31 23:40:00","2013-12-31 23:16:39","2013-12-31 23:24:05","2013-12-31 23:09:10","2013-12-31 23:26:26","2013-12-31 23:13:00","2013-12-31 23:22:00")
dropoff_datetime = c("2013-12-31 23:20:33","2013-12-31 23:28:00","2013-12-31 23:33:00","2013-12-31 23:48:00","2013-12-31 23:22:29","2013-12-31 23:28:37","2013-12-31 23:21:24","2013-12-31 23:36:54","2013-12-31 23:20:00","2013-12-31 23:27:00")
data <- data.frame(hack_license,pickup_datetime,dropoff_datetime)
I tried using dplyr and lubridate like this, but it doesn't work.
data %>%
group_by(data$hack_license) %>%
group_by(hack_license) %>%
mutate(waiting_time_in_secs = difftime(pickup_datetime,
lag(dropoff_datetime), units = 'secs'))
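Maybe someone here can help me. That would be great!
Answer 0 (score: 0)
You can convert both the pickup and dropoff columns to datetime and then, for each hack_license, calculate the time difference between the current pickup time and the previous dropoff time.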
library(dplyr)
library(lubridate)
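data <- data %>%
  # Parse the character timestamps into date-times
  mutate(pickup_datetime = ymd_hms(pickup_datetime),
         dropoff_datetime = ymd_hms(dropoff_datetime)) %>%
  # Group by driver so lag() never crosses from one driver to another
  group_by(hack_license) %>%
  # Waiting time = current pickup minus the previous trip's dropoff
  mutate(waiting_time_in_secs = as.numeric(difftime(pickup_datetime,
                                                    lag(dropoff_datetime), units = 'secs')))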
data
# hack_license pickup_datetime dropoff_datetime waiting_time_in_secs
# <chr> <dttm> <dttm> <dbl>
# 1 303F79923DA5DA7A10DF15E2D91CDCF7 2013-12-31 23:01:07 2013-12-31 23:20:33 NA
# 2 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:04:00 2013-12-31 23:28:00 NA
# 3 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:31:00 2013-12-31 23:33:00 180
# 4 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:40:00 2013-12-31 23:48:00 420
# 5 ABE23CA71E2DE84972281BA1C70B6EBB 2013-12-31 23:16:39 2013-12-31 23:22:29 NA
# 6 ABE23CA71E2DE84972281BA1C70B6EBB 2013-12-31 23:24:05 2013-12-31 23:28:37 96
# 7 BA83D7C383EAA4F9D78A1A8B83CB3E92 2013-12-31 23:09:10 2013-12-31 23:21:24 NA
# 8 BA83D7C383EAA4F9D78A1A8B83CB3E92 2013-12-31 23:26:26 2013-12-31 23:36:54 302
# 9 D476A1872F1F6594BD638C274483ED06 2013-12-31 23:13:00 2013-12-31 23:20:00 NA
#10 D476A1872F1F6594BD638C274483ED06 2013-12-31 23:22:00 2013-12-31 23:27:00 120
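The approach above relies on the rows of each driver already being in chronological order, because lag() simply takes the previous row within the group. The sample data happens to be sorted, but a real extract may not be. As a minimal sketch, assuming the same column names, the variant below sorts by driver and pickup time before taking the lag and drops the grouping afterwards. The `trips` and `waits` objects and the shortened driver IDs "A" and "B" are made up for illustration and are not part of the original data.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample: two drivers, rows deliberately out of time order
trips <- data.frame(
  hack_license = c("A", "B", "A", "B"),
  pickup_datetime = c("2013-12-31 23:31:00", "2013-12-31 23:24:05",
                      "2013-12-31 23:04:00", "2013-12-31 23:16:39"),
  dropoff_datetime = c("2013-12-31 23:33:00", "2013-12-31 23:28:37",
                       "2013-12-31 23:28:00", "2013-12-31 23:22:29"),
  stringsAsFactors = FALSE
)

waits <- trips %>%
  # Parse the character timestamps into date-times
  mutate(pickup_datetime = ymd_hms(pickup_datetime),
         dropoff_datetime = ymd_hms(dropoff_datetime)) %>%
  # Sort each driver's trips chronologically so lag() sees the true previous trip
  arrange(hack_license, pickup_datetime) %>%
  group_by(hack_license) %>%
  mutate(waiting_time_in_secs = as.numeric(difftime(pickup_datetime,
                                                    lag(dropoff_datetime),
                                                    units = "secs"))) %>%
  # Remove the grouping so later steps operate on the whole table
  ungroup()

waits
```

The first trip of each driver has no previous dropoff, so its waiting time is NA. Whether to keep or filter those rows depends on what the waiting times are used for.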