Decoding a regular expression in Perl

Time: 2012-02-02 18:08:12

Tags: regex perl

Can anyone decode what this regular expression means in Perl:

while (/([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g)

3 answers:

Answer 0: (score: 3)

Here is a breakdown of the regex:

(                     # start a capturing group (1)
   [0-9a-zA-Z-]+      # one or more digits or letters or hyphens
   (?:                # start a non-capturing group
      '               # a literal single quote character
      [a-zA-Z0-9-]+   # one or more digits or letters or hyphens
   )*                 # repeat non-capturing group zero or more times
)                     # end of capturing group 1

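The regex is written in the /.../g form and sits in the condition of a while loop, which means the code inside the while runs once for each non-overlapping match of the regex.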
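To make that concrete, here is a minimal sketch of the loop collecting every match. The sample string and the `@tokens` array are my own assumptions for illustration; the question does not show what `$_` contains.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the question.
my $text = "Don't stop the rock-n-roll, it's 1999!";

my @tokens;
for ($text) {    # alias $text to $_ so the bare /.../g from the question applies to it
    while (/([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g) {
        push @tokens, $1;    # $1 is capturing group 1 for the current match
    }
}

print join('|', @tokens), "\n";
# prints: Don't|stop|the|rock-n-roll|it's|1999
```

Note how the apostrophe only counts when it is followed by more letters, digits, or hyphens, so "Don't" and "it's" stay whole while the comma, spaces, and "!" act as separators.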

Answer 1: (score: 3)

There is a tool for this: YAPE::Regex::Explain

The regular expression:

(?-imsx:([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [0-9a-zA-Z\-]+           any character of: '0' to '9', 'a' to
                             'z', 'A' to 'Z', '\-' (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
----------------------------------------------------------------------
      '                        '\''
----------------------------------------------------------------------
      [a-zA-Z0-9\-]+           any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9', '\-' (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
    )*                       end of grouping
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

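Answer 2: (score: 2)

F.J's answer is a perfect breakdown. But... he left out one important part, the /g at the end. In scalar context it tells the regex engine to resume from where the previous match left off. So the while loop keeps stepping through the string until it reaches a point where there are no further matches.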
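You can watch the "resume from where it left off" behaviour with pos(), which reports the offset where the last /g match on a string ended. This is only a sketch; the sample string and the `@ends` array are assumptions of mine, not part of the question.

```perl
use strict;
use warnings;

my $text = "it's a-ok";    # hypothetical sample input

my @ends;
while ($text =~ /([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g) {
    # pos($text) is the offset just past the current match;
    # the next iteration starts searching from there.
    push @ends, pos($text);
    print "$1 ends at offset ", pos($text), "\n";
}
# prints:
#   it's ends at offset 4
#   a-ok ends at offset 9

# Once the match fails, Perl resets the position, so pos() is undef again.
print "position was reset\n" unless defined pos($text);
```

Without the /g, the match would restart from the beginning of the string on every iteration, and a while loop like this one would never terminate on a string that matches.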