Question

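Most stringr functions are just wrappers around the corresponding stringi functions, and str_replace_all is one of them. Yet my code does not work with stri_replace_all, the corresponding stringi function.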

I am writing a quick regular expression to convert (a subset of) camel case to space-separated words.

I am puzzled as to why this works:

str <- "thisIsCamelCase aintIt"
stringr::str_replace_all(str, 
                         pattern="(?<=[a-z])([A-Z])", 
                         replacement=" \\1")
# "this Is Camel Case aint It"

but this does not:

stri_replace_all(str, 
                 regex="(?<=[a-z])([A-Z])", 
                 replacement=" \\1")
# "this 1s 1amel 1ase aint 1t"

Answer 1

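If you look at the source of stringr::str_replace_all, you will see that it calls fix_replacement(replacement) to convert \\# capture-group references to $#. The help for stringi::stri_replace_all also states clearly that you use $1, $2, etc. for capture groups.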

str <- "thisIsCamelCase aintIt"
stri_replace_all(str, regex="(?<=[a-z])([A-Z])", replacement=" $1")
## [1] "this Is Camel Case aint It"
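
To make the conversion described above concrete, here is a minimal sketch in base R. The helper name backrefs_to_dollar is hypothetical, and this is not stringr's actual fix_replacement code; it only handles single-digit references and ignores edge cases such as escaped backslashes or literal dollar signs.

```r
# Hypothetical helper (NOT stringr's real implementation): rewrite
# backslash-style group references such as \1 into ICU-style $1.
backrefs_to_dollar <- function(replacement) {
  # Pattern: a literal backslash followed by one captured digit.
  # In base R's gsub(), "$" in the replacement is literal and \\1 is the group.
  gsub("\\\\([0-9])", "$\\1", replacement)
}

converted <- backrefs_to_dollar(" \\1")
print(converted)

# Only runs when stringi is installed.
if (requireNamespace("stringi", quietly = TRUE)) {
  str <- "thisIsCamelCase aintIt"
  print(stringi::stri_replace_all_regex(str, "(?<=[a-z])([A-Z])", converted))
}
```

With a helper like this, a replacement written in the \\1 style can be passed to stringi after conversion, which is roughly what the stringr wrapper is described as doing.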

Answer 2

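The following option should return the same output in both cases. It matches only the zero-width position between a lowercase and an uppercase letter, so the replacement needs no capture-group reference at all.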

pat <- "(?<=[a-z])(?=[A-Z])"
str_replace_all(str, pat, " ")
#[1] "this Is Camel Case aint It"
stri_replace_all(str, regex=pat, " ")
#[1] "this Is Camel Case aint It"
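
The same zero-width approach should also carry over to base R, which may help if neither package is available. This is a sketch under the assumption that perl = TRUE is used (lookbehind is not supported by R's default regex engine); split_camel is a hypothetical name introduced only for illustration.

```r
# Hypothetical helper: insert a space at every lowercase-to-uppercase boundary.
split_camel <- function(x) {
  # (?<=[a-z]) looks behind for a lowercase letter, (?=[A-Z]) looks ahead for
  # an uppercase one; the match itself is empty, so nothing is consumed.
  gsub("(?<=[a-z])(?=[A-Z])", " ", x, perl = TRUE)
}

print(split_camel("thisIsCamelCase aintIt"))
```

Because nothing is captured, the \\1 versus $1 difference between the engines never comes into play.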

The help page ?stri_replace_all includes examples showing that $1, $2 are used in the replacement:

stri_replace_all_regex('123|456|789', '(\\p{N}).(\\p{N})', '$2-$1')

So if we replace \\1 with $1, it should work:

stri_replace_all(str, regex = "(?<=[a-z])([A-Z])", " $1")
#[1] "this Is Camel Case aint It"

Using capture groups in str_replace / stri_replace - stringi vs stringr
