Question

Why does escaping the angle bracket > exhibit lookahead-like behavior?

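To be clear, I know the angle bracket does not need to be escaped. The question is how the pattern is being interpreted so that it yields the matches shown.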

## match bracket, with or without underscore
## replace with "greater_"
strings <- c("ten>eight", "ten_>_eight")
repl    <- "greater_"

## Unescaped. Yields desired results
gsub(">_?", repl, strings)
#  [1] "tengreater_eight"  "ten_greater_eight"

## All five of these yield the same result
gsub("\\>_?",   repl, strings)  # (a)
gsub("\\>(_?)", repl, strings)  # (b)
gsub("\\>(_)?", repl, strings)  # (c)
gsub("\\>?",    repl, strings)  # (d)
gsub("\\>",     repl, strings)  # (e)
#  [1] "tengreater_>eightgreater_"   "ten_greater_>_eightgreater_"

gregexpr("\\>?", strings)

Some follow-up questions:

1.  Why do `(a)` and `(d)` yield the same result? 
2.  Why is the end-of-string matched?
3.  Why do none of `a, b, or c` match the underscore?

Answer 1

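\\> is an end-of-word boundary: a zero-width assertion that matches between a word character (on its left) and either a non-word character (on its right) or the end-of-line anchor $.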

> strings <- c("ten>eight", "ten_>_eight")
> gsub("\\>", "greater_", strings)
[1] "tengreater_>eightgreater_"   "ten_greater_>_eightgreater_"

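In the example above, within the first element it matches only at the boundary between the word character n and the non-word character >, and at the boundary between t and the end-of-line anchor. In the second element it matches between _ (which is also a word character) and >, and then between t and the end-of-line anchor (i.e. $). Finally, it replaces each matched boundary with the string you specified. Because the boundary is zero-width, the replacement is effectively inserted at those positions and the > itself is left in place.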
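To see those boundaries directly, one option is to replace each match with a visible marker instead of real text. The sketch below reuses the strings vector from the question; the "|" marker and the variable names are arbitrary choices for illustration, and the second result is simply the output shown above with the replacement string swapped.

```r
strings <- c("ten>eight", "ten_>_eight")

# Literal ">" (fixed = TRUE turns off regex interpretation):
# the bracket itself is consumed and replaced by the marker.
literal <- gsub(">", "|", strings, fixed = TRUE)
print(literal)
# [1] "ten|eight"   "ten_|_eight"

# Escaped "\\>": a zero-width end-of-word assertion. Nothing is consumed,
# so the marker is inserted and the ">" survives.
boundary <- gsub("\\>", "|", strings)
print(boundary)
# [1] "ten|>eight|"   "ten_|>_eight|"
```

The marker lands before the > and again at the very end of each string, which is why the behavior looks like a lookahead.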

A simple example:

> gsub("\\>", "*", "f:r(:")
[1] "f*:r*(:"

Consider the following input string (w denotes a word character, N a non-word character):

    f:r(:
    wNwNN

So \\> matches between:

f and :
r and (
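The classification above can be reproduced by hand, without relying on the regex engine's boundary assertion. This is a minimal sketch, assuming a word character is anything in [[:alnum:]_] and that the end of the string counts as a non-word position; word_ends is a hypothetical helper written for this illustration, not a base R function.

```r
# Hypothetical helper: returns the indices of characters that are
# immediately followed by an end-of-word boundary.
word_ends <- function(x) {
  chars        <- strsplit(x, "")[[1]]              # one element per character
  is_word      <- grepl("^[[:alnum:]_]$", chars)    # w / N classification
  next_is_word <- c(is_word[-1], FALSE)             # end of string counts as non-word
  which(is_word & !next_is_word)
}

word_ends("f:r(:")
# [1] 1 3      (after "f" and after "r")

word_ends("ten_>_eight")
# [1]  4 11    (after the first "_" and after the final "t")
```

The second call shows the underscore being treated as part of the word: the boundary sits after ten_, not after ten.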

Example 2:

> gsub("\\>", "*", "f") 
[1] "f*"

Input string:

f$
||----End of the line anchor
w

Replacing the matched boundary with * gives the result above.
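The same reading appears to account for the follow-up questions. An end-of-word boundary can only sit where the next character is a non-word character or the end of the string, and _ counts as a word character, so an optional _ placed right after \\> never has an underscore to consume. A minimal check, reusing the question's inputs (the variable names are illustrative only):

```r
strings <- c("ten>eight", "ten_>_eight")
repl    <- "greater_"

with_underscore <- gsub("\\>_?", repl, strings)  # pattern (a)
boundary_only   <- gsub("\\>",   repl, strings)  # pattern (e)

# Expected TRUE, given the output reported in the question:
# the "_?" part always matches the empty string.
same_result <- identical(with_underscore, boundary_only)
print(same_result)

# The underscore that follows ">" is still present in the second element,
# so it was never consumed by the pattern.
underscore_kept <- grepl(">_eight", with_underscore[2], fixed = TRUE)
print(underscore_kept)
```

Dropping the backslashes, as in the unescaped pattern from the question, turns > back into a literal character that is consumed together with the optional underscore.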
