ISO 8601 extended duration format PnYnMnDTnHnMnS in R

Asked: 2017-12-08 17:01:56

Tags: r datetime duration lubridate

Is there a way to parse an ISO 8601 duration such as "P3Y6M4DT12H30M5S" and return something like "3 years, 6 months, 4 days, 12:30:05"?

I have had no luck with lubridate's duration or with the parsedate package.

1 answer:

Answer 0 (score: 2)

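I don't know of a package that solves this either (one may exist), but you can parse the string yourself with a regular expression, because the pattern is fixed ("PnYnMnDTnHnMnS") according to Wikipedia: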

    gsub("P(\\d+)Y(\\d+)M(\\d+)DT(\\d+)H(\\d+)M(\\d+)S",
         "\\1 Years, \\2 Months, \\3 Days, \\4:\\5:\\6",
         "P3Y6M4DT12H30M5S")

Output:

    [1] "3 Years, 6 Months, 4 Days, 12:30:5"
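The fixed pattern above only matches strings that contain all six components. ISO 8601 also permits components to be omitted (for example "PT36H" or "P1Y"), so the following is a minimal base R sketch for that case. The helper name `parse_iso_duration` is hypothetical, and the sketch assumes integer components only, with no weeks and no fractional values.

```r
# Hypothetical helper (not from any package): parse an ISO 8601 duration
# whose components may be omitted. Assumes integer values, no weeks (nW).
parse_iso_duration <- function(x) {
  ok <- grepl("^P(\\d+Y)?(\\d+M)?(\\d+D)?(T(\\d+H)?(\\d+M)?(\\d+S)?)?$", x)
  if (length(x) != 1 || !ok) stop("unsupported ISO 8601 duration: ", x)

  # "M" means months before the "T" and minutes after it, so split first
  date_part <- sub("T.*$", "", x)
  time_part <- sub("^[^T]*T?", "", x)

  # Return the digits that directly precede `unit`, or 0 if the unit is absent
  get_num <- function(s, unit) {
    m <- regmatches(s, regexpr(paste0("\\d+(?=", unit, ")"), s, perl = TRUE))
    if (length(m)) as.integer(m) else 0L
  }

  c(years   = get_num(date_part, "Y"),
    months  = get_num(date_part, "M"),
    days    = get_num(date_part, "D"),
    hours   = get_num(time_part, "H"),
    minutes = get_num(time_part, "M"),
    seconds = get_num(time_part, "S"))
}

parse_iso_duration("P3Y6M4DT12H30M5S")
parse_iso_duration("PT36H")
```

Splitting at the "T" before extracting is what keeps "P1M" (one month) distinct from "PT1M" (one minute).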

Edit

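If you only want to zero-pad the seconds and nothing else: to validate the regex, I have put two elements in a vector here, one with single-digit seconds and one with two-digit seconds (assuming the seconds never exceed 60):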

    vect <- c("P3Y6M4DT12H30M5S", "P3Y6M4DT12H30M15S")
    # Prefix a zero only where the seconds field is a single digit
    ifelse(grepl(".*M(\\d)S", vect),
           gsub("P(\\d+)Y(\\d+)M(\\d+)DT(\\d+)H(\\d+)M(\\d)S",
                "\\1 Years, \\2 Months, \\3 Days, \\4:\\5:0\\6", vect),
           gsub("P(\\d+)Y(\\d+)M(\\d+)DT(\\d+)H(\\d+)M(\\d+)S",
                "\\1 Years, \\2 Months, \\3 Days, \\4:\\5:\\6", vect))

Output:

    [1] "3 Years, 6 Months, 4 Days, 12:30:05"
    [2] "3 Years, 6 Months, 4 Days, 12:30:15"
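As a possible alternative to the two-branch `ifelse()`, the captured fields can be converted to integers and formatted with `sprintf()`, which handles the zero-padding through `%02d`. This is only a sketch in base R: the name `format_one` is made up for illustration, and unlike the code above it pads hours and minutes as well as seconds.

```r
vect <- c("P3Y6M4DT12H30M5S", "P3Y6M4DT12H30M15S")
pattern <- "P(\\d+)Y(\\d+)M(\\d+)DT(\\d+)H(\\d+)M(\\d+)S"

# regexec() yields, per string, the full match followed by the six groups
parts <- regmatches(vect, regexec(pattern, vect))

# Hypothetical helper: drop the full match, then format the six integers
format_one <- function(p) {
  n <- as.integer(p[-1])
  sprintf("%d years, %d months, %d days, %02d:%02d:%02d",
          n[1], n[2], n[3], n[4], n[5], n[6])
}

formatted <- vapply(parts, format_one, character(1))
formatted
# [1] "3 years, 6 months, 4 days, 12:30:05"
# [2] "3 years, 6 months, 4 days, 12:30:15"
```

Strings that do not match the pattern come back with NA in every field here, so input that may omit components would need to be checked first.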

If you want to pad every single-digit element:

    library(stringr)
    # Join the six captured fields with "-" so they can be split apart again
    topad <- gsub("P(\\d+)Y(\\d+)M(\\d+)DT(\\d+)H(\\d+)M(\\d+)S", "\\1-\\2-\\3-\\4-\\5-\\6", vect)
    splitvect <- strsplit(topad, split = "-")
    # Pad each field to width 2, then append its label or separator
    unlist(lapply(splitvect, function(x)
      paste0(str_pad(x, 2, side = "left", pad = "0"),
             c("Years, ", "Months, ", "Days, ", ":", ":", ""), collapse = "")))

Output:

    [1] "03Years, 06Months, 04Days, 12:30:05"
    [2] "03Years, 06Months, 04Days, 12:30:15"