Question

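I have a rough idea of what \B and \b do. Accordingly, I tried some code (taken from the Internet), but I can't understand how the output of those regexp anchors is produced. Can anyone help me understand the difference between \B and \b, and explain how they are handled internally in Ruby pattern matching?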

Interactive ruby ready.
> str = "Hit him on the head\n" +
      "Hit him on the head with a 2×4\n"
=> "Hit him on the head
Hit him on the head with a 2×4
"
> str.scan(/\w+\B/)
=> ["Hi", "hi", "o", "th", "hea", "Hi", "hi", "o", "th", "hea", "wit"]
> str.scan(/\w+\b/)
=> ["Hit", "him", "on", "the", "head", "Hit", "him", "on", "the", "head", "with", "a", "2", "4"]
>

Thanks,

Answer 1

As with most lowercase/uppercase pairs of escapes, they are exact opposites:

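\b matches at a word boundary. It is a zero-width match (it does not consume a character), so it matches the position between two characters where one is a word character and the other is not; the start and end of the string count as non-word for this purpose. In the text "this person", \b matches at the following positions (marked with vertical bars): "|this| |person|".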

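\B matches everywhere except at a word boundary. It matches at these positions: "t|h|i|s p|e|r|s|o|n", i.e. between any two letters, but not between a letter and a non-letter character. (It would also match between two non-word characters, but this text has no such position.)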
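To check those positions yourself, here is a minimal sketch. The helpers boundary_positions and mark are names I made up for illustration, not part of Ruby; the only library calls used are String#index with a start offset and String#insert.

```ruby
# Hypothetical helper: return every index (0..length) at which the
# zero-width `anchor` matches exactly at that index.
def boundary_positions(text, anchor)
  (0..text.length).select { |i| text.index(anchor, i) == i }
end

# Hypothetical helper: draw a "|" at each of the given positions.
# Inserting from right to left keeps the earlier indices valid.
def mark(text, positions)
  marked = text.dup
  positions.reverse_each { |i| marked.insert(i, "|") }
  marked
end

text = "this person"

puts mark(text, boundary_positions(text, /\b/))  # |this| |person|
puts mark(text, boundary_positions(text, /\B/))  # t|h|i|s p|e|r|s|o|n
```

Every position from 0 to the string length is matched by exactly one of the two anchors, which is what "exact opposites" means here.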

So if you match \w+\b against "this person", you get "this", because + is greedy and matches as many word characters (\w) as possible, up to the next word boundary.

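\w+\B works similarly, but it cannot match "this", because "this" is followed by a word boundary, which \B forbids. So the engine backtracks one character and matches "thi" instead.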
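A small sketch of that difference, using only String#scan on ASCII-only sample strings I picked for illustration. It also suggests why "a", "2" and "4" disappear from the \w+\B output in the question: a one-character word has nothing to give back when the engine backtracks, so it cannot match at all.

```ruby
text = "this person"

# Greedy \w+ runs to the end of each word, where \b succeeds.
up_to_boundary = text.scan(/\w+\b/)

# \B fails at the end of each word, so \w+ gives back one character.
# The leftover last letter ("s", "n") cannot match on its own either,
# because it is immediately followed by a boundary.
one_short = text.scan(/\w+\B/)

# One-character words: \w+ needs at least one character, and the only
# position after it is a boundary, so there is no match.
single_chars = "a 2 4".scan(/\w+\B/)

p up_to_boundary  # ["this", "person"]
p one_short       # ["thi", "perso"]
p single_chars    # []
```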

How do the Regexp anchors \B and \b differ from each other?
