How to calculate the time difference between two columns with a lag

Time: 2020-10-18 09:07:14

Tags: r dplyr lubridate difftime

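I am currently working with a New York City taxi trip dataset. For each trip I have the driver ID, the pickup date and time, and the dropoff date and time. I now want to calculate the waiting time between the dropoff of the previous trip and the pickup of the next trip. This means computing the time difference between two columns with a lag: the dropoff time comes from the previous trip (the previous row) and the pickup time from the current trip. The calculation has to be grouped by driver ID so that I never compute a time difference between trips of two different drivers.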

A possible dataset looks like this:

hack_license = c("303F79923DA5DA7A10DF15E2D91CDCF7","697ABFCDF7E7C77A01183C857132F2A4","697ABFCDF7E7C77A01183C857132F2A4","697ABFCDF7E7C77A01183C857132F2A4","ABE23CA71E2DE84972281BA1C70B6EBB","ABE23CA71E2DE84972281BA1C70B6EBB","BA83D7C383EAA4F9D78A1A8B83CB3E92","BA83D7C383EAA4F9D78A1A8B83CB3E92","D476A1872F1F6594BD638C274483ED06","D476A1872F1F6594BD638C274483ED06")

pickup_datetime = c("2013-12-31 23:01:07","2013-12-31 23:04:00","2013-12-31 23:31:00","2013-12-31 23:40:00","2013-12-31 23:16:39","2013-12-31 23:24:05","2013-12-31 23:09:10","2013-12-31 23:26:26","2013-12-31 23:13:00","2013-12-31 23:22:00")

dropoff_datetime = c("2013-12-31 23:20:33","2013-12-31 23:28:00","2013-12-31 23:33:00","2013-12-31 23:48:00","2013-12-31 23:22:29","2013-12-31 23:28:37","2013-12-31 23:21:24","2013-12-31 23:36:54","2013-12-31 23:20:00","2013-12-31 23:27:00")

data <- data.frame(hack_license,pickup_datetime,dropoff_datetime)

I tried to use dplyr and lubridate like this, but it does not work.

data %>%
  group_by(data$hack_license) %>%
  group_by(hack_license) %>%
  mutate(waiting_time_in_secs = difftime(pickup_datetime,
                                         lag(dropoff_datetime), units = 'secs'))

Maybe someone here can help me. That would be great!

1 Answer:

Answer 0 (score: 0)

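You can convert both the pickup and the dropoff column to datetime and then, for each hack_license, calculate the time difference between the current pickup time and the previous dropoff time.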

library(dplyr)
library(lubridate)

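data <- data %>%
  # Parse the character columns into date-times
  mutate(pickup_datetime = ymd_hms(pickup_datetime),
         dropoff_datetime = ymd_hms(dropoff_datetime)) %>%
  # Group by driver so that lag() never reaches into another driver's trips
  group_by(hack_license) %>%
  # Current pickup minus previous dropoff; NA for each driver's first trip
  mutate(waiting_time_in_secs = as.numeric(difftime(pickup_datetime,
                                                    lag(dropoff_datetime), units = 'secs')))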
data
#   hack_license                     pickup_datetime     dropoff_datetime    waiting_time_in_secs
#   <chr>                            <dttm>              <dttm>                             <dbl>
# 1 303F79923DA5DA7A10DF15E2D91CDCF7 2013-12-31 23:01:07 2013-12-31 23:20:33                   NA
# 2 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:04:00 2013-12-31 23:28:00                   NA
# 3 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:31:00 2013-12-31 23:33:00                  180
# 4 697ABFCDF7E7C77A01183C857132F2A4 2013-12-31 23:40:00 2013-12-31 23:48:00                  420
# 5 ABE23CA71E2DE84972281BA1C70B6EBB 2013-12-31 23:16:39 2013-12-31 23:22:29                   NA
# 6 ABE23CA71E2DE84972281BA1C70B6EBB 2013-12-31 23:24:05 2013-12-31 23:28:37                   96
# 7 BA83D7C383EAA4F9D78A1A8B83CB3E92 2013-12-31 23:09:10 2013-12-31 23:21:24                   NA
# 8 BA83D7C383EAA4F9D78A1A8B83CB3E92 2013-12-31 23:26:26 2013-12-31 23:36:54                  302
# 9 D476A1872F1F6594BD638C274483ED06 2013-12-31 23:13:00 2013-12-31 23:20:00                   NA
#10 D476A1872F1F6594BD638C274483ED06 2013-12-31 23:22:00 2013-12-31 23:27:00                  120
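
One thing to keep in mind: lag() takes the previous row within each group, so the approach above relies on each driver's trips already being in chronological order. The sample data is ordered that way, but a larger file may not be. The following is a minimal sketch, not part of the original answer, that sorts by driver and pickup time before taking the lag. The toy data frame `trips`, its driver IDs "A" and "B", and the result name `waits` are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: driver "A" has two trips listed out of order.
trips <- data.frame(
  hack_license     = c("A", "A", "B"),
  pickup_datetime  = c("2013-12-31 23:31:00", "2013-12-31 23:04:00", "2013-12-31 23:13:00"),
  dropoff_datetime = c("2013-12-31 23:33:00", "2013-12-31 23:28:00", "2013-12-31 23:20:00")
)

waits <- trips %>%
  mutate(pickup_datetime  = ymd_hms(pickup_datetime),
         dropoff_datetime = ymd_hms(dropoff_datetime)) %>%
  # Sort so that the "previous row" really is the previous trip of that driver
  arrange(hack_license, pickup_datetime) %>%
  group_by(hack_license) %>%
  mutate(waiting_time_in_secs = as.numeric(difftime(pickup_datetime,
                                                    lag(dropoff_datetime),
                                                    units = "secs"))) %>%
  ungroup()

waits$waiting_time_in_secs
# NA 180 NA
```

Without the arrange() step, the lag for driver "A" would pair the 23:04 pickup with the 23:33 dropoff and produce a negative waiting time. The final ungroup() is optional; it just returns an ungrouped data frame for later steps.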