Converting dates/times collected in two different formats in the same variable into one consistent format

Time: 2018-01-28 12:37:57

Tags: r date datetime posixct

     Request.id Pickup.point Driver.id            Status   Request.timestamp      Drop.timestamp
1           619      Airport         1    Trip Completed     11/7/2016 11:51     11/7/2016 13:00
2           867      Airport         1    Trip Completed     11/7/2016 17:57     11/7/2016 18:47
3          1807         City         1    Trip Completed      12/7/2016 9:17      12/7/2016 9:58
4          2532      Airport         1    Trip Completed     12/7/2016 21:08     12/7/2016 22:03
5          3112         City         1    Trip Completed 13-07-2016 08:33:16 13-07-2016 09:25:47
6          3879      Airport         1    Trip Completed 13-07-2016 21:57:28 13-07-2016 22:28:59
7          4270      Airport         1    Trip Completed 14-07-2016 06:15:32 14-07-2016 07:13:15
8          5510      Airport         1    Trip Completed 15-07-2016 05:11:52 15-07-2016 06:07:52
9          6248         City         1    Trip Completed 15-07-2016 17:57:27 15-07-2016 18:50:51
10          267         City         2    Trip Completed      11/7/2016 6:46      11/7/2016 7:25
11         1467      Airport         2    Trip Completed      12/7/2016 5:08      12/7/2016 6:02
12         1983         City         2    Trip Completed     12/7/2016 12:30     12/7/2016 12:57
13         2784      Airport         2    Trip Completed 13-07-2016 04:49:20 13-07-2016 05:23:03
14         3075         City         2    Trip Completed 13-07-2016 08:02:53 13-07-2016 09:16:19
15         3379         City         2    Trip Completed 13-07-2016 14:23:02 13-07-2016 15:35:18
16         3482      Airport         2    Trip Completed 13-07-2016 17:23:18 13-07-2016 18:20:51
17         4652         City         2    Trip Completed 14-07-2016 12:01:02 14-07-2016 12:36:46
18         5335      Airport         2    Trip Completed 14-07-2016 22:24:13 14-07-2016 23:18:52
19          535      Airport         3    Trip Completed     11/7/2016 10:00     11/7/2016 10:31
20          960      Airport         3    Trip Completed     11/7/2016 18:45     11/7/2016 19:23
21         1934      Airport         3    Trip Completed     12/7/2016 11:17     12/7/2016 12:23
22         2083      Airport         3    Trip Completed     12/7/2016 15:46     12/7/2016 16:40
23         2211      Airport         3    Trip Completed     12/7/2016 18:00     12/7/2016 18:28
24         3096      Airport         3    Trip Completed 13-07-2016 08:17:29 13-07-2016 09:22:37
25         3881      Airport         3    Trip Completed 13-07-2016 21:54:18 13-07-2016 22:51:23
26         5254         City         3    Trip Completed 14-07-2016 21:23:03 14-07-2016 22:25:19
27         5434         City         3    Trip Completed 15-07-2016 02:41:38 15-07-2016 03:24:43
28         5916         City         3    Trip Completed 15-07-2016 10:00:43 15-07-2016 10:53:06
29          669         City         4    Trip Completed     11/7/2016 13:08     11/7/2016 13:49
30         1567      Airport         4    Trip Completed      12/7/2016 6:21      12/7/2016 7:10

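In the dataset shown above, the columns Request.timestamp and Drop.timestamp contain date values in different formats. How can we convert the dates in both columns to the same format? And how can we extract the date and the time separately?

2 answers:

Answer 0 (score: 1)

To convert times in both formats, we need to determine which format each value uses. I used the lubridate package because it is easier to work with than some of the standard R date functions.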

rawData <- "Request.id|Pickup.point|Driver.id|Status      |Request.timestamp  |Drop.timestamp
         619   |  Airport   |     1   |Trip Completed|    11/7/2016 11:51|    11/7/2016 13:00
867   |  Airport   |     1   |Trip Completed|    11/7/2016 17:57|    11/7/2016 18:47
1807   |     City   |     1   |Trip Completed|     12/7/2016 9:17|     12/7/2016 9:58
2532   |  Airport   |     1   |Trip Completed|    12/7/2016 21:08|    12/7/2016 22:03
3112   |     City   |     1   |Trip Completed|13-07-2016 08:33:16|13-07-2016 09:25:47
3879   |  Airport   |     1   |Trip Completed|13-07-2016 21:57:28|13-07-2016 22:28:59
4270   |  Airport   |     1   |Trip Completed|14-07-2016 06:15:32|14-07-2016 07:13:15
5510   |  Airport   |     1   |Trip Completed|15-07-2016 05:11:52|15-07-2016 06:07:52
6248   |     City   |     1   |Trip Completed|15-07-2016 17:57:27|15-07-2016 18:50:51
267   |     City   |     2   |Trip Completed|     11/7/2016 6:46|     11/7/2016 7:25
1467   |  Airport   |     2   |Trip Completed|     12/7/2016 5:08|     12/7/2016 6:02
1983   |     City   |     2   |Trip Completed|    12/7/2016 12:30|    12/7/2016 12:57
2784   |  Airport   |     2   |Trip Completed|13-07-2016 04:49:20|13-07-2016 05:23:03
3075   |     City   |     2   |Trip Completed|13-07-2016 08:02:53|13-07-2016 09:16:19
3379   |     City   |     2   |Trip Completed|13-07-2016 14:23:02|13-07-2016 15:35:18
3482   |  Airport   |     2   |Trip Completed|13-07-2016 17:23:18|13-07-2016 18:20:51
4652   |     City   |     2   |Trip Completed|14-07-2016 12:01:02|14-07-2016 12:36:46
5335   |  Airport   |     2   |Trip Completed|14-07-2016 22:24:13|14-07-2016 23:18:52
535   |  Airport   |     3   |Trip Completed|    11/7/2016 10:00|    11/7/2016 10:31
960   |  Airport   |     3   |Trip Completed|    11/7/2016 18:45|    11/7/2016 19:23
1934   |  Airport   |     3   |Trip Completed|    12/7/2016 11:17|    12/7/2016 12:23
2083   |  Airport   |     3   |Trip Completed|    12/7/2016 15:46|    12/7/2016 16:40
2211   |  Airport   |     3   |Trip Completed|    12/7/2016 18:00|    12/7/2016 18:28
3096   |  Airport   |     3   |Trip Completed|13-07-2016 08:17:29|13-07-2016 09:22:37
3881   |  Airport   |     3   |Trip Completed|13-07-2016 21:54:18|13-07-2016 22:51:23
5254   |     City   |     3   |Trip Completed|14-07-2016 21:23:03|14-07-2016 22:25:19
5434   |     City   |     3   |Trip Completed|15-07-2016 02:41:38|15-07-2016 03:24:43
5916   |     City   |     3   |Trip Completed|15-07-2016 10:00:43|15-07-2016 10:53:06
669   |     City   |     4   |Trip Completed|    11/7/2016 13:08|    11/7/2016 13:49
1567   |  Airport   |     4   |Trip Completed|     12/7/2016 6:21|     12/7/2016 7:10"
library(lubridate) 
data <- read.csv(text=rawData,header=TRUE,
                 sep="|",
                 stringsAsFactors=FALSE)

convertTime <- function(aVector){
    # Parse each element with the parser that matches its layout.
    # Both layouts are day-first: "11/7/2016 11:51" is 11 July 2016.
    unlist(lapply(aVector,function(x){
          ifelse(grepl("/",x),
                 dmy_hm(x),
                 dmy_hms(x))
     }))
}
requestTime <- convertTime(data$Request.timestamp)
dropTime <- convertTime(data$Drop.timestamp)
# ifelse() drops the POSIXct class, so convert the numeric seconds back
as_datetime(requestTime)

...and the output:

> as_datetime(requestTime)
 [1] "2016-07-11 11:51:00 UTC" "2016-07-11 17:57:00 UTC" "2016-07-12 09:17:00 UTC"
 [4] "2016-07-12 21:08:00 UTC" "2016-07-13 08:33:16 UTC" "2016-07-13 21:57:28 UTC"
 [7] "2016-07-14 06:15:32 UTC" "2016-07-15 05:11:52 UTC" "2016-07-15 17:57:27 UTC"
[10] "2016-07-11 06:46:00 UTC" "2016-07-12 05:08:00 UTC" "2016-07-12 12:30:00 UTC"
[13] "2016-07-13 04:49:20 UTC" "2016-07-13 08:02:53 UTC" "2016-07-13 14:23:02 UTC"
[16] "2016-07-13 17:23:18 UTC" "2016-07-14 12:01:02 UTC" "2016-07-14 22:24:13 UTC"
[19] "2016-07-11 10:00:00 UTC" "2016-07-11 18:45:00 UTC" "2016-07-12 11:17:00 UTC"
[22] "2016-07-12 15:46:00 UTC" "2016-07-12 18:00:00 UTC" "2016-07-13 08:17:29 UTC"
[25] "2016-07-13 21:54:18 UTC" "2016-07-14 21:23:03 UTC" "2016-07-15 02:41:38 UTC"
[28] "2016-07-15 10:00:43 UTC" "2016-07-11 13:08:00 UTC" "2016-07-12 06:21:00 UTC"
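The same two-format parse can also be written without a per-element loop. The following is only a minimal base-R sketch, not the answer's method: it assumes both layouts are day-first (as the 13-07-2016 rows suggest), that UTC is an acceptable time zone, and it uses a small hypothetical vector x in place of the real column.

```r
# Hypothetical sample values in the two layouts seen in the question
x <- c("11/7/2016 11:51", "12/7/2016 9:17", "13-07-2016 08:33:16")

# Parse the whole vector once per layout; non-matching elements become NA
slash <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")
dash  <- as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

# Fill the gaps left by the first parse with results from the second
parsed <- slash
missing <- is.na(slash)
parsed[missing] <- dash[missing]

print(parsed)
```

Any element that matches neither layout stays NA, so anyNA(parsed) gives a quick check that nothing was silently dropped.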

Answer 1 (score: 0)

The OP has dates/times in heterogeneous formats within a data frame. In cases like this, lubridate is very handy.

library(lubridate)
df <- read.table(text = "Request.id Pickup.point Driver.id            Status   Request.timestamp      Drop.timestamp
1           619      Airport         1    'Trip Completed'     '11/7/2016 11:51'     '11/7/2016 13:00'
2           867      Airport         1    'Trip Completed'     '11/7/2016 17:57'     '11/7/2016 18:47'
3          1807         City         1    'Trip Completed'      '12/7/2016 9:17'      '12/7/2016 9:58'
4          2532      Airport         1    'Trip Completed'     '12/7/2016 21:08'     '12/7/2016 22:03'
5          3112         City         1    'Trip Completed' '13-07-2016 08:33:16' '13-07-2016 09:25:47'
6          3879      Airport         1    'Trip Completed' '13-07-2016 21:57:28' '13-07-2016 22:28:59'
7          4270      Airport         1    'Trip Completed' '14-07-2016 06:15:32' '14-07-2016 07:13:15'
8          5510      Airport         1    'Trip Completed' '15-07-2016 05:11:52' '15-07-2016 06:07:52'
9          6248         City         1    'Trip Completed' '15-07-2016 17:57:27' '15-07-2016 18:50:51'", header = TRUE, stringsAsFactors = FALSE)

# Use parse_date_time to convert heterogeneous date-times; each value is matched against the listed orders
df$Request.timestamp <- parse_date_time(df$Request.timestamp, c("dmY HM", "dmY HMS"))
df$Drop.timestamp <- parse_date_time(df$Drop.timestamp, c("dmY HM", "dmY HMS"))

df

The converted date/time data is:

 Request.id Pickup.point Driver.id         Status   Request.timestamp      Drop.timestamp
1        619      Airport         1 Trip Completed 2016-07-11 11:51:00 2016-07-11 13:00:00
2        867      Airport         1 Trip Completed 2016-07-11 17:57:00 2016-07-11 18:47:00
3       1807         City         1 Trip Completed 2016-07-12 09:17:00 2016-07-12 09:58:00
4       2532      Airport         1 Trip Completed 2016-07-12 21:08:00 2016-07-12 22:03:00
5       3112         City         1 Trip Completed 2016-07-13 08:33:16 2016-07-13 09:25:47
6       3879      Airport         1 Trip Completed 2016-07-13 21:57:28 2016-07-13 22:28:59
7       4270      Airport         1 Trip Completed 2016-07-14 06:15:32 2016-07-14 07:13:15
8       5510      Airport         1 Trip Completed 2016-07-15 05:11:52 2016-07-15 06:07:52
9       6248         City         1 Trip Completed 2016-07-15 17:57:27 2016-07-15 18:50:51

Additional code to separate the date and the time:

# format() returns character strings in the requested pattern
df$Request.timestamp_date <- format(df$Request.timestamp, "%Y-%m-%d")
df$Request.timestamp_time <- format(df$Request.timestamp, "%H:%M:%S")
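If the date part is needed for date arithmetic rather than display, it may be more convenient to keep it as a Date object instead of a string. A minimal base-R sketch, assuming the timestamps are POSIXct in UTC; the variable ts is a hypothetical stand-in for one parsed value of the column:

```r
# Hypothetical parsed timestamp standing in for a value of df$Request.timestamp
ts <- as.POSIXct("2016-07-13 08:33:16", tz = "UTC")

# Date part as a Date object (supports differences, sequences, sorting)
request_date <- as.Date(ts, tz = "UTC")

# Time-of-day part as a string, plus the hour as an integer for grouping
request_time <- format(ts, "%H:%M:%S")
request_hour <- as.integer(format(ts, "%H"))

print(request_date)
print(request_time)
print(request_hour)
```

Passing tz explicitly to as.Date() avoids the date shifting by a day when the timestamps are stored in a time zone other than UTC.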