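How to rearrange date and time

Time: 2016-07-06 03:42:35

Tags: r

Could you tell me how to rearrange the date-times in data set A so that they are compatible with the date-times in data set B (which are in GMT+10)? Thanks.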

**data set A**

sitecode status  start                           end
ANS0009  spike   11/09/2013 04:45:00 PM (GMT+11) 11/09/2013 05:00:00 PM (GMT+11)
ARM0064  spike   05/03/2014 11:00:00 AM (GMT+10) 05/03/2014 11:15:00 AM (GMT+10)
BAS0059  dry     13/01/2013 00:00:00 AM (GMT+11) 29/03/2013 11:45:00 PM (GMT+11)
BAS0059  spike   11/03/2014 10:15:00 AM (GMT+10) 11/03/2014 10:30:00 AM (GMT+10)
BLC0097  failure 12/20/2012 05:00:00 PM (GMT+11) 12/31/2012 11:45:00 PM (GMT+11)
BLC0097  spike   24/12/2015 04:59:45 PM (GMT+10) 24/12/2015 05:01:50 PM (GMT+10)


**data set B**

sitecode  status               start                 end
EUM0056  record   2012-12-01 11:00:00   2013-10-06 01:45:00
EUM0056 missing   2013-10-06 01:45:00   2013-10-06 03:00:00
EUM0056  record   2013-10-06 03:00:00   2014-03-11 20:15:00
MDL0026  record   2012-12-07 11:00:00   2013-04-04 19:45:00
MDL0026 missing   2013-04-04 19:45:00   2014-02-27 23:00:00
MDL0026  record   2014-02-27 23:00:00   2014-10-05 01:45:00

2 Answers:

Answer 0 (score: 0)

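We can use lubridate to parse the mixed formats after splitting each string in two at the "(GMT+...)" suffix. The suffix gives the offset in hours, which is then used to shift the wall-clock time to GMT+10.

library(lubridate)
library(stringr)
# Split into the timestamp and the "(GMT+nn)" suffix
v1 <- strsplit(str1, "\\s+(?=\\()", perl = TRUE)[[1]]
# Offset in hours, e.g. 11 for "(GMT+11)"
offset <- as.numeric(str_extract(v1[2], "\\d+"))
# Try day/month/year first, then month/day/year, and shift to GMT+10.
# The "GMT" label is only a placeholder; the value is a GMT+10 wall-clock time.
parse_date_time(v1[1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
          tz = "GMT", exact = TRUE) - lubridate::hours(offset - 10)
#[1] "2013-09-11 15:45:00 GMT"

Applied to the full data set (assuming start and end are character columns):

datA[c("start", "end")] <- lapply(datA[c("start", "end")], function(x) {
  # Column 1 holds the timestamp, column 2 the "(GMT+nn)" suffix
  m1 <- do.call(rbind, strsplit(x, "\\s+(?=\\()", perl = TRUE))
  offset <- as.numeric(str_extract(m1[, 2], "\\d+"))
  parse_date_time(m1[, 1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
                  tz = "GMT", exact = TRUE) - lubridate::hours(offset - 10)
})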

Data

str1 <- "11/09/2013 04:45:00 PM (GMT+11)"
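
As a cross-check, here is a minimal base-R sketch of the same conversion that goes through the true UTC instant instead of relabelling wall-clock times. The helper name `to_gmt10` is hypothetical and not part of the original answer. It assumes every value ends in a "(GMT+nn)" suffix and that `%p` is parsed in an English-style AM/PM locale.

```r
# Hypothetical helper, not from the original answer.
to_gmt10 <- function(x) {
  # Hours east of GMT, taken from the "(GMT+nn)" suffix
  offset <- as.numeric(sub(".*\\(GMT\\+(\\d+)\\)\\s*$", "\\1", x))
  stamp  <- sub("\\s*\\(GMT\\+\\d+\\)\\s*$", "", x)
  # Try day/month/year first, then month/day/year for whatever failed
  wall <- as.POSIXct(stamp, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")
  miss <- is.na(wall)
  wall[miss] <- as.POSIXct(stamp[miss], format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
  # Wall-clock time minus its own offset is the true UTC instant
  utc <- wall - offset * 3600
  # "Etc/GMT-10" is the tz database name for a fixed UTC+10 zone (sign inverted)
  format(utc, "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT-10")
}

x <- c("11/09/2013 04:45:00 PM (GMT+11)",
       "05/03/2014 11:00:00 AM (GMT+10)",
       "12/20/2012 05:00:00 PM (GMT+11)")
to_gmt10(x)
```

The result is a character vector in the same "YYYY-MM-DD HH:MM:SS" layout as data set B. Ambiguous dates such as 11/09/2013 are read as day/month/year here because that format is tried first.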

Answer 1 (score: 0)

library(lubridate)

exampleA <- c("11/09/2013 04:45:00 PM (GMT+11)",
              "11/09/2013 04:45:00 PM (GMT+10)")
exampleA <- data.frame(exampleA, stringsAsFactors = FALSE)
# Flag the rows recorded in GMT+11; they need to move back one hour
exampleA$flag <- as.integer(grepl("\\(GMT\\+11\\)", exampleA$exampleA))
# Drop only the "(GMT+nn)" suffix and keep AM/PM, so 04:45 PM is read as 16:45
exampleA$exampleA <- sub("\\s*\\(GMT\\+\\d+\\)$", "", exampleA$exampleA)
exampleA$exampleA <- mdy_hms(exampleA$exampleA)
# Subtract one hour (3600 seconds) from the GMT+11 rows only
is11 <- exampleA$flag == 1
exampleA$exampleA[is11] <- exampleA$exampleA[is11] - 3600

# Expected GMT+10 wall-clock times
exampleB <- c("2013-11-09 15:45:00", "2013-11-09 16:45:00")
exampleB <- ymd_hms(exampleB)

# Proof it works
exampleA$exampleA == exampleB

[1] TRUE TRUE

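If a single data set mixes formats (i.e. mdy, ydm, etc.), you can handle this with an if statement, run through either apply or a for loop, that checks whether the value in a given position is > 12 to determine the format, and then convert with the corresponding lubridate function.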
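
The "> 12" check described above could look roughly like the following base-R sketch. The helpers `guess_order` and `parse_mixed` are hypothetical names introduced for illustration. The sketch assumes slash-separated dates with the "(GMT+nn)" suffix already removed, and it treats dates that fit both orders as day/month/year, which is an assumption rather than something the data can confirm. lubridate's `dmy_hms()` and `mdy_hms()` could be substituted for the `as.POSIXct()` calls.

```r
# Hypothetical helpers, not from the original answer.
guess_order <- function(x) {
  parts  <- strsplit(x, "/", fixed = TRUE)
  first  <- as.integer(vapply(parts, `[`, character(1), 1))
  second <- as.integer(vapply(parts, `[`, character(1), 2))
  # A value above 12 cannot be a month, so it must be the day
  ifelse(first > 12, "dmy", ifelse(second > 12, "mdy", "ambiguous"))
}

parse_mixed <- function(x) {
  ord <- guess_order(x)
  out <- as.POSIXct(rep(NA, length(x)), tz = "UTC")
  is_mdy <- ord == "mdy"
  out[is_mdy] <- as.POSIXct(x[is_mdy], format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
  # Assumption: ambiguous dates are read as day/month/year
  out[!is_mdy] <- as.POSIXct(x[!is_mdy], format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")
  out
}

x <- c("24/12/2015 04:59:45 PM", "12/20/2012 05:00:00 PM", "11/09/2013 04:45:00 PM")
guess_order(x)
parse_mixed(x)
```

Rows reported as "ambiguous" are worth checking against neighbouring records for the same site, since the digits alone cannot settle which field is the month.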