Regex pipe after a square bracket

Time: 2014-05-02 09:34:57

Tags: regex

I came across a regular expression that I don't quite understand.

It looks like this:

([|)\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b(]|)

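I understand that it tries to match numbers such as 255.255, and that the match should be a whole word.

But what are the ([|) and (]|) for? The square bracket and the pipe in the last one are also in the wrong order.

2 answers:

Answer 0: (score: 1)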

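krackmoe, interestingly, there is no ([|): it is an optical illusion.

The regex engine does not see ([|).

It sees ( opening capture group 1, and then it sees a character class, [|)\b(25[0-5], which doesn't make much sense for several reasons. For example, inside a character class \b is not a word boundary (in most flavors, such as PCRE, Python and JavaScript, it stands for the backspace character), and the characters 2 and 5 are redundant with the range 0-5.

So it is no wonder you don't understand it.

I think the author meant to put a word boundary there, but as it stands, it is a typo.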
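The effect of that accidental character class can be observed directly. The following is a minimal sketch, assuming Python's `re` module (the question does not say which regex flavor is in use, so other engines may differ in details); the variable names are made up for illustration.

```python
import re

# The regex from the question, copied verbatim (split over two lines only for width).
PATTERN = re.compile(
    r"([|)\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
    r"\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b(]|)"
)

# The engine sees three capture groups, not the four that the
# parentheses appear to form at first glance.
group_count = PATTERN.groups

# Group 1 can be a single character from the accidental class,
# for example "(" or a backspace (\x08).
paren_match = PATTERN.fullmatch("(.255")
backspace_match = PATTERN.fullmatch("\x08.1")

# The 25[0-5] alternative was swallowed by the class, so a leading
# 255 no longer fits in group 1 and the whole string does not match.
full_255 = PATTERN.fullmatch("255.255")

# An unanchored search still finds a match, but it starts mid-number.
partial_255 = PATTERN.search("255.255")

print(group_count, paren_match.groups(), full_255, partial_255.group(0))
```

In this sketch, `fullmatch` returns `None` for 255.255, while `search` matches only 55.255, which is probably not what the pattern's author had in mind.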

For reference, here is a token-by-token explanation of the regex. (Don't worry, I didn't type all of it; it was generated automatically by RegexBuddy.)

* Match the regex below and capture its match into backreference number 1 `([|)\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)`
    * Match this alternative (attempting the next alternative only if this one fails) `[|)\b(25[0-5]`
        * Match a single character present in the list below `[|)\b(25[0-5]`
            * A single character from the list “|)” `|)`
            * The character `\b`
            * A single character from the list “(25[” `(25[`
            * A character in the range between “0” and “5” `0-5`
    * Or match this alternative (attempting the next alternative only if this one fails) `2[0-4][0-9]`
        * Match the character “2” literally `2`
        * Match a single character in the range between “0” and “4” `[0-4]`
        * Match a single character in the range between “0” and “9” `[0-9]`
    * Or match this alternative (the entire group fails if this one fails to match) `[01]?[0-9][0-9]?`
        * Match a single character from the list “01” `[01]?`
            * Between zero and one times, as many times as possible, giving back as needed (greedy) `?`
        * Match a single character in the range between “0” and “9” `[0-9]`
        * Match a single character in the range between “0” and “9” `[0-9]?`
            * Between zero and one times, as many times as possible, giving back as needed (greedy) `?`
* Match the character “.” literally `\.`
* Match the regex below and capture its match into backreference number 2 `(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)`
    * Match this alternative (attempting the next alternative only if this one fails) `25[0-5]`
        * Match the character string “25” literally `25`
        * Match a single character in the range between “0” and “5” `[0-5]`
    * Or match this alternative (attempting the next alternative only if this one fails) `2[0-4][0-9]`
        * Match the character “2” literally `2`
        * Match a single character in the range between “0” and “4” `[0-4]`
        * Match a single character in the range between “0” and “9” `[0-9]`
    * Or match this alternative (the entire group fails if this one fails to match) `[01]?[0-9][0-9]?`
        * Match a single character from the list “01” `[01]?`
            * Between zero and one times, as many times as possible, giving back as needed (greedy) `?`
        * Match a single character in the range between “0” and “9” `[0-9]`
        * Match a single character in the range between “0” and “9” `[0-9]?`
            * Between zero and one times, as many times as possible, giving back as needed (greedy) `?`
* Assert position at a word boundary (position preceded or followed—but not both—by a Unicode letter, digit, or underscore) `\b`
* Match the regex below and capture its match into backreference number 3 `(]|)`
    * Match this alternative (attempting the next alternative only if this one fails) `]`
        * Match the character “]” literally `]`
    * Or match this alternative (the entire group fails if this one fails to match)
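The character-class entries in the listing above can be checked by brute force. This is a minimal sketch, assuming Python's `re` module (RegexBuddy's wording may reflect a different flavor); `CLASS_ONLY` and `members` are names made up for this illustration.

```python
import re

# Only the accidental character class from the start of group 1.
CLASS_ONLY = re.compile(r"[|)\b(25[0-5]")

# Try every ASCII character and keep the ones the class accepts.
members = [chr(code) for code in range(128) if CLASS_ONLY.fullmatch(chr(code))]

# repr() makes the backspace character visible as '\x08'.
print([repr(ch) for ch in members])
```

Under these assumptions the class accepts eleven characters: backspace, the two parentheses, the digits 0 to 5, the opening square bracket, and the pipe. The literal 2 and 5 add nothing beyond the 0-5 range.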

Answer 1: (score: 1)

The purpose of the regex is unclear. Debuggex visualizes it nicely.

[Figure: regular expression visualization (Debuggex Demo)]

The part that matches 0 to 255 is clear (000 and 00 are also accepted values). However, it is unclear why it tries to match the |)([ symbols.
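The 0 to 255 behavior is easy to confirm in isolation. A minimal sketch, assuming Python's `re` module; `OCTET`, `candidates`, and `accepted` are illustrative names, and the sample values are arbitrary.

```python
import re

# The number sub-pattern that appears (intact) in the second group of the regex.
OCTET = re.compile(r"25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?")

candidates = ["0", "00", "000", "199", "249", "255", "256", "300", "1000"]

# fullmatch requires the whole string to fit the pattern.
accepted = [s for s in candidates if OCTET.fullmatch(s)]

print(accepted)
```

Zero-padded forms such as 00 and 000 pass because the last alternative allows an optional leading 0 or 1, while 256, 300 and 1000 are rejected.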

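I think the leading [ and the trailing ] are there by mistake. Without them, the inner regex looks reasonable. But (|)\b also looks wrong, so my guess is that we can drop the (|) as well.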

(|)\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b(|)

[Figure: regular expression visualization of the regex without the square brackets (Debuggex Demo)]
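As a rough check of the variant without the square brackets, here is a minimal sketch, again assuming Python's `re` module. Whether this variant reflects the original author's intent is only a guess, as the answer itself says; the variable names are illustrative.

```python
import re

# The regex with the leading [ and trailing ] removed.
SIMPLIFIED = re.compile(
    r"(|)\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
    r"\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b(|)"
)

# Now there really are four groups; the two (|) groups can only be empty.
group_count = SIMPLIFIED.groups

# 255.255 matches in full this time.
whole = SIMPLIFIED.fullmatch("255.255")

# Surrounding brackets are simply not part of the match.
inside_brackets = SIMPLIFIED.search("[1.2]")

# Out-of-range numbers are rejected when the whole string must match.
too_large = SIMPLIFIED.fullmatch("256.1")

print(group_count, whole.groups(), inside_brackets.group(0), too_large)
```

Since the (|) groups always capture the empty string, dropping them would change only the group numbering, not what the pattern matches.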