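Is there a regular expression for preserving case, in the vein of \U and \L?
In the example below, I want to convert "date" to "month" while keeping the capitalization used in input.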
from to
"date" ~~> "month"
"Date" ~~> "Month"
"DATE" ~~> "MONTH"
I currently use three nested calls to sub to accomplish this.
input <- c("date", "Date", "DATE")
expected.out <- c("month", "Month", "MONTH")
sub("date", "month",
    sub("Date", "Month",
        sub("DATE", "MONTH", input)
    )
)
The goal is to have a single pattern and a single replacement, such as
gsub("(date)", "\\Umonth", input, perl=TRUE)
that would produce the desired output.
Answer 0 (score: 7)
This is one of those situations where I think a for loop is justified:
input <- rep("Here are a date, a Date, and a DATE",2)
pat <- c("date", "Date", "DATE")
ret <- c("month", "Month", "MONTH")
for(i in seq_along(pat)) { input <- gsub(pat[i],ret[i],input) }
input
#[1] "Here are a month, a Month, and a MONTH"
#[2] "Here are a month, a Month, and a MONTH"
Another option, courtesy of @flodel, implements the same logic as the loop with Reduce:
Reduce(function(str, args) gsub(args[1], args[2], str),
Map(c, pat, ret), init = input)
For benchmarks of these options, see @TylerRinker's answer.
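If the three case variants always follow the lower / Title / UPPER scheme, the pat and ret vectors can be derived from a single word pair instead of typed out by hand. The following is only a sketch under that assumption; case_variants and replace_keep_case are hypothetical helper names, not functions from base R or any package.

```r
# Hypothetical helper: build the lower, Title and UPPER variants of a word.
case_variants <- function(word) {
  lower <- tolower(word)
  title <- paste0(toupper(substring(lower, 1, 1)), substring(lower, 2))
  c(lower, title, toupper(lower))
}

# Hypothetical wrapper around the same for loop as above.
# fixed = TRUE treats each variant as a literal string, not a regex.
replace_keep_case <- function(x, from, to) {
  pat <- case_variants(from)
  ret <- case_variants(to)
  for (i in seq_along(pat)) {
    x <- gsub(pat[i], ret[i], x, fixed = TRUE)
  }
  x
}

input <- c("date", "Date", "DATE")
result <- replace_keep_case(input, "date", "month")
result
#[1] "month" "Month" "MONTH"
```

Mixed-case inputs such as "dAtE" are left untouched by this sketch, just as they are by the loop it wraps.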
Answer 1 (score: 5)
Using the gsubfn package, you can avoid the nested sub calls and do this in a single call.
> library(gsubfn)
> x <- 'Here we have a date, a different Date, and a DATE'
> gsubfn('date', list('date'='month','Date'='Month','DATE'='MONTH'), x, ignore.case=T)
# [1] "Here we have a month, a different Month, and a MONTH"
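For comparison, a similar single-pass replacement can be sketched in base R with gregexpr and the regmatches replacement function, which avoids the extra package. This is an illustration rather than part of the gsubfn approach; match_case is a hypothetical helper, and it only distinguishes all-uppercase, capitalized, and everything else.

```r
# Hypothetical helper: copy the case style of `template` onto `word`.
match_case <- function(template, word) {
  if (template == toupper(template)) {
    toupper(word)                                   # DATE -> MONTH
  } else if (substring(template, 1, 1) == toupper(substring(template, 1, 1))) {
    paste0(toupper(substring(word, 1, 1)), substring(word, 2))  # Date -> Month
  } else {
    word                                            # date -> month
  }
}

x <- "Here we have a date, a different Date, and a DATE"

# Find every occurrence regardless of case, then rewrite each match
# according to the case of the text that was actually matched.
m <- gregexpr("date", x, ignore.case = TRUE)
regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
  vapply(hits, match_case, character(1), word = "month", USE.NAMES = FALSE)
})
x
# [1] "Here we have a month, a different Month, and a MONTH"
```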
Answer 2 (score: 4)
Here is a qdap approach. Pretty straightforward, but not the fastest:
input <- rep("Here are a date, a Date, and a DATE",2)
pat <- c("date", "Date", "DATE")
ret <- c("month", "Month", "MONTH")
library(qdap)
mgsub(pat, ret, input)
## [1] "Here are a month, a Month, and a MONTH"
## [2] "Here are a month, a Month, and a MONTH"
Benchmark:
input <- rep("Here are a date, a Date, and a DATE",1000)
library(microbenchmark)
(op <- microbenchmark(
GSUBFN = gsubfn('date', list('date'='month','Date'='Month','DATE'='MONTH'),
input, ignore.case=T),
QDAP = mgsub(pat, ret, input),
REDUCE = Reduce(function(str, args) gsub(args[1], args[2], str),
Map(c, pat, ret), init = input),
# local() so the loop actually runs and does not overwrite input
FOR = local({
for(i in seq_along(pat)) {
input <- gsub(pat[i],ret[i],input)
}
input
}),
times=100L))
## Unit: milliseconds
##    expr        min         lq     median         uq        max neval
##  GSUBFN 682.549812 815.908385 847.361883 925.385557 1186.66743   100
##    QDAP  10.499195  12.217805  13.059149  13.912157   25.77868   100
##  REDUCE   4.267602   5.184986   5.482151   5.679251   28.57819   100
##     FOR   4.244743   5.148132   5.434801   5.870518   10.28833   100
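To put those numbers in perspective, the medians can be expressed relative to the fastest option. The values below are simply copied from the table above (milliseconds), so this is arithmetic on the reported results, not a new measurement.

```r
# Median timings in milliseconds, copied from the benchmark output above.
med <- c(GSUBFN = 847.361883, QDAP = 13.059149,
         REDUCE = 5.482151, FOR = 5.434801)

# Slowdown factor of each approach relative to the fastest median.
rel <- round(med / min(med), 1)
rel
```

On these figures the for loop and Reduce are effectively tied, qdap's mgsub is roughly 2.4 times slower, and gsubfn is roughly 156 times slower.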