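I recently implemented a function that escapes characters that could be interpreted as regular expression syntax. The escaped strings are used in system calls made by my R package 'rNOMADS':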
SanitizeWGrib2Inputs <- function(check.strs) {
    #Escape regex characters before inputting to wgrib2
    #INPUTS
    #    CHECK.STRS - Strings possibly containing regex metacharacters
    #OUTPUTS
    #    CHECKED.STRS - Strings with metacharacters appropriately escaped
    meta.chars <- paste0("\\", c("(", ")", ".", "+", "*", "^", "$", "?", "[", "]", "|"))
    for(k in 1:length(meta.chars)) {
        check.strs <- stringr::str_replace(check.strs, meta.chars[k], paste0("\\\\", meta.chars[k]))
    }
    checked.strs <- check.strs
    return(checked.strs)
}
I included an example in my package documentation:
check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
checked.strs <- SanitizeWGrib2Inputs(check.strs)
This works fine on my Ubuntu machine and passes the CRAN checks there. However, when I uploaded the package to CRAN, their Windows checker reported:
> check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
> checked.strs <- SanitizeWGrib2Inputs(check.strs)
Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
  Invalid capture group name. (U_REGEX_INVALID_CAPTURE_GROUP_NAME)
Calls: SanitizeWGrib2Inputs -> <Anonymous> -> stri_replace_first_regex -> .Call
Execution halted
** running examples for arch 'x64' ... ERROR
Running examples in 'rNOMADS-Ex.R' failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: SanitizeWGrib2Inputs
> ### Title: Make sure regex metacharacters are properly escaped
> ### Aliases: SanitizeWGrib2Inputs
> ### Keywords: internal
>
> ### ** Examples
>
>
> check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
> checked.strs <- SanitizeWGrib2Inputs(check.strs)
Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
  Invalid capture group name. (U_REGEX_INVALID_CAPTURE_GROUP_NAME)
Calls: SanitizeWGrib2Inputs -> <Anonymous> -> stri_replace_first_regex -> .Call
Execution halted
I verified this behavior on a Windows partition. What is a workaround for this?
Answer 0 (score: 2)
Use a simple str_replace_all call to escape all of the special regex metacharacters:
SanitizeWGrib2Inputs <- function(check.strs) {
    # Prefix every regex metacharacter with a backslash; $0 is the whole match
    return(stringr::str_replace_all(check.strs, "[{\\[\\]()|?$^*+.\\\\]", "\\$0"))
}
check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
checked.strs <- SanitizeWGrib2Inputs(check.strs)
checked.strs
## => [1] "frank" "je\\.rry" "har\\\\old" "Johnny Ca\\$h"
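To see why the escaping matters, here is a minimal sketch that compares an unescaped and an escaped pattern. It uses base R's grepl as a stand-in for the regex matching that wgrib2 performs, which is an assumption made only for illustration; the names unescaped.hits and escaped.hits and the test string "jeXrry" are hypothetical.

```r
# Candidate strings: the intended literal and a near miss (hypothetical).
candidates <- c("je.rry", "jeXrry")

# Unescaped: "." acts as a wildcard, so both candidates match.
unescaped.hits <- grepl("je.rry", candidates)

# Escaped form taken from the output above: only the literal "je.rry" matches.
escaped.hits <- grepl("je\\.rry", candidates)

print(unescaped.hits)
print(escaped.hits)
```

The first call reports a match for both strings, while the second matches only the literal "je.rry".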
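Notes:
"[{\\[\\]()|?$^*+.\\\\]" (as a regex, [{\[\]()|?$^*+.\\]) matches any single one of these characters: {, [, ], (, ), |, ?, $, ^, *, +, . or \.
The replacement changes each matched character to \ followed by the same character ("\\$0" is a backreference to the whole match value).
There is no need to escape }, because it is not a special character outside a character class when there is no paired opening { before it.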