Explain a JavaScript regular expression by breaking it into smaller pieces

Time: 2013-06-13 08:13:19

Tags: javascript regex

The following function extracts the video ID from a YouTube URL.

    function youtubeLinkParser(url) {
        var regExp = /^.*(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*).*/;
        var match = url.match(regExp);
        if (match && match[2].length == 11) {
            return match[2];
        } else {
            return null;
        }
    }

I'm new to regular expressions. Could someone break this regex into smaller pieces and explain how it works?

3 answers:

Answer 0: (score: 2)

Here is the explanation produced by YAPE::Regex::Explain:

The regular expression:

(?-imsx:^.*(youtu.be/|v/|u/\w/|embed/|watch\?v=|\&v=)([^#\&\?]*).*)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    youtu                    'youtu'
----------------------------------------------------------------------
    .                        any character except \n
----------------------------------------------------------------------
    be/                      'be/'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    v/                       'v/'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    u/                       'u/'
----------------------------------------------------------------------
    \w                       word characters (a-z, A-Z, 0-9, _)
----------------------------------------------------------------------
    /                        '/'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    embed/                   'embed/'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    watch                    'watch'
----------------------------------------------------------------------
    \?                       '?'
----------------------------------------------------------------------
    v=                       'v='
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \&                       '&'
----------------------------------------------------------------------
    v=                       'v='
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    [^#\&\?]*                any character except: '#', '\&', '\?' (0
                             or more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
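
To see the table in action, here is a minimal sketch that assembles the same pattern from its pieces and prints the two capture groups. The variable names and the sample URL (with a placeholder 11-character ID) are made up for illustration, and `\&`, `\?` inside the character class are written as plain `&`, `?`, which match the same characters.

```javascript
// Pieces in the same order as the rows of the table above.
const anything = String.raw`.*`; // any characters except newline, greedy
const marker = String.raw`(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|&v=)`; // capture group 1
const videoId = String.raw`([^#&?]*)`; // capture group 2
const assembled = new RegExp("^" + anything + marker + videoId + anything);

// Hypothetical sample URL; "abcdefghijk" is a placeholder ID, not a real video.
const sampleUrl = "https://www.youtube.com/watch?v=abcdefghijk&t=10";
const groups = assembled.exec(sampleUrl);

console.log(groups[1]); // "watch?v="
console.log(groups[2]); // "abcdefghijk"
```

Group 2 stops at the `&` because the character class excludes it; the trailing `.*` then swallows `&t=10`.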

Answer 1: (score: 0)

^.*

At the start, there can be anything.

Then comes one of these:

youtu.be/    (that's the intention, but actually the dot can be any char)
v/
u/one word character (letter, digit, or underscore)/
embed/
watch?v=
&v=

Whichever of these matched becomes match[1].

Then come zero or more characters that are not #, & or ?. Those characters become match[2].

Finally, anything may follow.
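
The remark about the dot is easy to check. A small sketch, assuming nothing beyond the built-in `RegExp.prototype.test`; the variable names and test strings are invented for illustration:

```javascript
// The unescaped dot in the original pattern accepts any character.
const looseHost = /youtu.be\//;
// Escaping the dot restricts it to a literal period.
const strictHost = /youtu\.be\//;

console.log(looseHost.test("youtu.be/"));  // true
console.log(looseHost.test("youtuXbe/"));  // true, probably not intended
console.log(strictHost.test("youtuXbe/")); // false

// \w matches exactly one letter, digit, or underscore.
const userPath = /u\/\w\//;
console.log(userPath.test("u/1/"));  // true
console.log(userPath.test("u/ab/")); // false: two characters between the slashes
```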

Answer 2: (score: 0)

/^.*

Starts with any characters

(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)

Matches any one of these:

  • youtu<any character>be/ <- this should probably be youtu\.be/
  • v/
  • u/\w/
  • embed/
  • watch?v=
  • &v=

([^#\&\?]*)

Then any characters except the #, & and ? symbols

.*/

Any characters until the end
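
If the dot really is meant to be literal, one possible variant of the question's function looks like the sketch below. The name `extractYoutubeId` and the sample URLs (with a placeholder ID) are hypothetical; only the escaped dot and the simplified `&`/`?` escapes differ from the original pattern.

```javascript
// Hypothetical variant of the question's function with the dot escaped.
function extractYoutubeId(url) {
  const pattern = /^.*(youtu\.be\/|v\/|u\/\w\/|embed\/|watch\?v=|&v=)([^#&?]*).*/;
  const match = url.match(pattern);
  // Keep the original check that the captured ID is 11 characters long.
  return match && match[2].length === 11 ? match[2] : null;
}

// Placeholder URLs, not real videos.
console.log(extractYoutubeId("https://youtu.be/abcdefghijk"));                    // "abcdefghijk"
console.log(extractYoutubeId("https://www.youtube.com/embed/abcdefghijk?rel=0")); // "abcdefghijk"
console.log(extractYoutubeId("https://youtuXbe/abcdefghijk"));                    // null
```

The length check is inherited from the question; it is only a rough filter and does not validate which characters the ID contains.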