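Could you tell me how to rearrange the date-times in data set A so that they are compatible with the date-times in data set B (which are in GMT+10)? Thanks.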
**data set A**
sitecode status start end
ANS0009 spike 11/09/2013 04:45:00 PM (GMT+11) 11/09/2013 05:00:00 PM (GMT+11)
ARM0064 spike 05/03/2014 11:00:00 AM (GMT+10) 05/03/2014 11:15:00 AM (GMT+10)
BAS0059 dry 13/01/2013 00:00:00 AM (GMT+11) 29/03/2013 11:45:00 PM (GMT+11)
BAS0059 spike 11/03/2014 10:15:00 AM (GMT+10) 11/03/2014 10:30:00 AM (GMT+10)
BLC0097 failure 12/20/2012 05:00:00 PM (GMT+11) 12/31/2012 11:45:00 PM (GMT+11)
BLC0097 spike 24/12/2015 04:59:45 PM (GMT+10) 24/12/2015 05:01:50 PM (GMT+10)
**data set B**
sitecode status start end
EUM0056 record 2012-12-01 11:00:00 2013-10-06 01:45:00
EUM0056 missing 2013-10-06 01:45:00 2013-10-06 03:00:00
EUM0056 record 2013-10-06 03:00:00 2014-03-11 20:15:00
MDL0026 record 2012-12-07 11:00:00 2013-04-04 19:45:00
MDL0026 missing 2013-04-04 19:45:00 2014-02-27 23:00:00
MDL0026 record 2014-02-27 23:00:00 2014-10-05 01:45:00
Answer 0: (score: 0)
We can use lubridate to parse the mixed formats after splitting each string in two to separate the timestamp from the (GMT+...) suffix.
library(lubridate)
library(stringr)
str1 <- "11/09/2013 04:45:00 PM (GMT+11)"
# split into the timestamp and the "(GMT+11)" suffix
v1 <- strsplit(str1, "\\s+(?=\\()", perl = TRUE)[[1]]
# try day/month/year first, then month/day/year;
# subtract (offset - 10) hours so the clock time is expressed in GMT+10
# (tz = "GMT" is only a neutral container for the shifted values)
parse_date_time(v1[1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
    tz = "GMT", exact = TRUE) - hours(as.numeric(str_extract(v1[2], "\\d+")) - 10)
#[1] "2013-09-11 15:45:00 GMT"
Applied to the full example data set:
datA[c("start", "end")] <- lapply(datA[c("start", "end")], function(x) {
    m1 <- do.call(rbind, strsplit(x, "\\s+(?=\\()", perl = TRUE))
    parse_date_time(m1[, 1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
        tz = "GMT", exact = TRUE) - hours(as.numeric(str_extract(m1[, 2], "\\d+")) - 10)
})
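As a cross-check on the offset arithmetic, the same conversion can be sketched in base R by first recovering the true instant in UTC and then formatting it in a fixed GMT+10 zone. This is a minimal sketch and not part of the original answer: the helper name `to_gmt10` is hypothetical, it assumes day-first dates and an English locale for `%p`, and it relies on the POSIX convention that the zone name `Etc/GMT-10` means UTC+10.

```r
# Hypothetical helper: convert "dd/mm/yyyy hh:mm:ss AM/PM (GMT+N)" strings
# to "yyyy-mm-dd HH:MM:SS" text expressed in GMT+10 (the data set B layout).
to_gmt10 <- function(x) {
  # numeric offset taken from the "(GMT+N)" suffix
  offset <- as.numeric(sub(".*\\(GMT\\+([0-9]+)\\)[[:space:]]*$", "\\1", x))
  # timestamp without the suffix
  stamp <- sub("[[:space:]]*\\(GMT.*$", "", x)
  # read the clock time as if it were UTC, then subtract the offset
  # to obtain the true instant (assumption: day-first dates)
  utc <- as.POSIXct(stamp, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC") -
    offset * 3600
  # "Etc/GMT-10" is UTC+10 (the sign is inverted in POSIX zone names)
  format(utc, "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT-10")
}

to_gmt10(c("11/09/2013 04:45:00 PM (GMT+11)",
           "05/03/2014 11:00:00 AM (GMT+10)"))
# "2013-09-11 15:45:00" "2014-03-05 11:00:00"
```

Going through UTC keeps the arithmetic explicit: a GMT+11 reading moves back one hour, while a GMT+10 reading is unchanged.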
Answer 1: (score: 0)
library(lubridate)
exampleA <- c("11/09/2013 04:45:00 PM (GMT+11)",
"11/09/2013 04:45:00 PM (GMT+10)")
exampleA <- as.data.frame(exampleA)
# flag the rows recorded in GMT+11
exampleA$flag <- 0
exampleA$flag[grep("\\(GMT\\+11\\)", exampleA$exampleA)] <- 1
# strip only the offset suffix; keep AM/PM so the hour is parsed correctly
exampleA$exampleA <- gsub(" \\(GMT\\+1[01]\\)", "", exampleA$exampleA)
exampleA$exampleA <- mdy_hms(exampleA$exampleA)
# shift only the GMT+11 rows back one hour (3600 seconds) to GMT+10
exampleA$exampleA[exampleA$flag == 1] <- exampleA$exampleA[exampleA$flag == 1] - 3600
exampleB <- c("2013-11-09 15:45:00", "2013-11-09 16:45:00")
exampleB <- ymd_hms(exampleB)
# Proof it works
exampleA$exampleA == exampleB
[1] TRUE TRUE
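If a single data set mixes formats (e.g. mdy, ydm, etc.), you can handle this with an if statement, applied with apply or in a for loop, that literally checks whether the value in a given position is > 12 to determine the format, and then convert with the corresponding lubridate function.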
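The check described above can be sketched as follows. This is only an illustration in base R under stated assumptions: the helper name `guess_order` is hypothetical, the dates are assumed to be slash-separated with the year last, and a date whose first two fields are both 12 or less stays ambiguous and is returned as `NA` rather than guessed.

```r
# Hypothetical helper: label each string as day-first ("dmy") or
# month-first ("mdy") by looking for a field greater than 12.
guess_order <- function(x) {
  # keep only the date part, e.g. "13/01/2013", and split it on "/"
  parts <- strsplit(sub(" .*$", "", x), "/", fixed = TRUE)
  first <- as.integer(vapply(parts, `[`, character(1), 1))
  second <- as.integer(vapply(parts, `[`, character(1), 2))
  # a value above 12 cannot be a month, so it must be the day
  ifelse(first > 12, "dmy", ifelse(second > 12, "mdy", NA_character_))
}

guess_order(c("13/01/2013 00:00:00 AM (GMT+11)",
              "12/20/2012 05:00:00 PM (GMT+11)",
              "11/09/2013 04:45:00 PM (GMT+11)"))
# "dmy" "mdy" NA
```

The resulting labels can then select between lubridate's `dmy_hms()` and `mdy_hms()`; rows returned as `NA` need outside knowledge of how the data were recorded.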