Python regex grouping?

Time: 2014-11-20 05:58:21

Tags: python regex grouping email-validation

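I have been trying to work out a regular expression that matches email addresses for a project. These are the requirements:

Emails must be of the form acct@domain

  • acct is 1 or more characters, consisting only of uppercase or lowercase letters, digits, dashes, periods, underscores, and hyphens
  • acct cannot begin or end with an underscore, dash, period, or hyphen. There must be at least two letters before and after every period.
  • domain is 5 or more characters, consisting only of uppercase or lowercase letters, digits, dashes, periods, hyphens, and underscores
  • domain must contain at least one period and cannot begin or end with an underscore, dash, period, or hyphen. There must be at least two letters before and after every period.

    I have worked out the acct part of the code: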

    if re.search(r"^[a-zA-z0-9]+[a-zA-z0-9-_]*$|^[a-zA-z0-9]+[a-zA-z0-9-_]+[\.]{1}[a-zA-z0-9]{2,}$", email):
        print("valid!")
    

    And the domain part:

    if re.search(r"^[a-zA-z0-9]+[a-zA-z0-9-_]+[\.]{1}[a-zA-z0-9]{2,}$", email):
        print("valid!")
    

    My problem is that I cannot figure out how to combine the two and put an @ sign between them.

    I tried the following, but it does not work.

    if re.search(r"(^[a-zA-z0-9]+[a-zA-z0-9-_]*$|^[a-zA-z0-9]+[a-zA-z0-9-_]+[\.]{1}[a-zA-z0-9]{2,}$)@(^[a-zA-z0-9]+[a-zA-z0-9-_]+[\.]{1}[a-zA-z0-9]{2,}$)", email):
        print("valid!")

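    It did not work! I cannot get it to match anything. If you have suggestions for making the code less of an eyesore, please let me know!

    3 Answers:

    Answer 0: (score: 1)

    Remove the anchors from both groups and apply them to the regex as a whole.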

    if re.search(r"^(?:[a-zA-Z0-9]+[a-zA-Z0-9-]*|[a-zA-Z0-9]+[a-zA-Z0-9-]+\.[a-zA-Z0-9]{2,})@[a-zA-Z0-9]+[-\w]+\.[a-zA-Z0-9]{2,}$", email):
        print("valid!")
    

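    Changes made

    • The anchors ^ and $ now apply to the whole regex

    • [\.]{1} can be simplified to \. because it matches exactly one literal .

    • [a-zA-z0-9-_] can be simplified to [-\w] (the A-z range in the original is a typo for A-Z and also matches [ \ ] ^ _ `; in Python 3, \w additionally matches non-ASCII word characters unless re.ASCII is set)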
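
To illustrate why the anchors have to move, here is a minimal sketch. The fragments `ACCT` and `DOMAIN` and the two compiled names are hypothetical simplifications written for this example, not the exact patterns from the question. With `$` left inside the first group, the regex requires the end of the string to be followed by `@`, which can never happen.

```python
import re

# Simplified, hypothetical fragments for illustration only.
ACCT = r"[a-zA-Z0-9]+[-\w]*"
DOMAIN = r"[a-zA-Z0-9]+[-\w]+\.[a-zA-Z0-9]{2,}"

# Anchors left inside each group: "$" must hold right before "@"
# and "^" right after it, so no string can satisfy the pattern.
anchored_inside = re.compile(r"(^" + ACCT + r"$)@(^" + DOMAIN + r"$)")

# Anchors moved to the outside, wrapping the whole expression.
anchored_outside = re.compile(r"^(?:" + ACCT + r")@(?:" + DOMAIN + r")$")

email = "john-doe@example.com"
print(bool(anchored_inside.search(email)))   # False
print(bool(anchored_outside.search(email)))  # True

# On Python 3.4+, re.fullmatch gives a similar effect without writing ^ and $.
print(bool(re.fullmatch(ACCT + "@" + DOMAIN, email)))  # True
```

One difference worth knowing: `$` also matches just before a trailing newline, whereas `re.fullmatch` requires the pattern to consume the entire string.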

    Answer 1: (score: 1)

    Use a non-capturing group to combine the two regexes.

    if re.search(r"^(?:[a-zA-Z0-9]+[a-zA-Z0-9-]*|[a-zA-Z0-9]+[a-zA-Z0-9-]+[.][a-zA-Z0-9]{2,})@[a-zA-Z0-9]+[a-zA-Z0-9-_]+[.][a-zA-Z0-9]{2,}$", email):
        print("valid")
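
The group matters because `|` has the lowest precedence in a regex. The toy patterns below are made up for this sketch (they are not email patterns) and show how an ungrouped alternation splits the whole expression, while `(?:...)` limits the choice without creating a capture.

```python
import re

# Without a group, "|" splits the whole pattern: "^ab" OR "cd@x$".
ungrouped = re.compile(r"^ab|cd@x$")

# With a non-capturing group, "|" only chooses between "ab" and "cd".
grouped = re.compile(r"^(?:ab|cd)@x$")

print(bool(ungrouped.search("ab-anything")))  # True: the "^ab" branch matches alone
print(bool(grouped.search("ab-anything")))    # False: "@x" is still required
print(bool(grouped.search("cd@x")))           # True
print(grouped.search("cd@x").groups())        # (): nothing was captured
```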
    

    DEMO

    Regular expression:

    ^                        the beginning of the string
    (?:                      group, but do not capture:
      [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9' (1 or more times)
      [a-zA-Z0-9-]*            any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9', '-' (0 or more times)
     |                        OR
      [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9' (1 or more times)
      [a-zA-Z0-9-]+            any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9', '-' (1 or more times)
      [.]                      any character of: '.'
      [a-zA-Z0-9]{2,}          any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9' (at least 2 times)
    )                        end of grouping
    @                        '@'
    [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to 'Z',
                             '0' to '9' (1 or more times)
    [a-zA-Z0-9-_]+           any character of: 'a' to 'z', 'A' to 'Z',
                             '0' to '9', '-', '_' (1 or more times)
    [.]                      any character of: '.'
    [a-zA-Z0-9]{2,}          any character of: 'a' to 'z', 'A' to 'Z',
                             '0' to '9' (at least 2 times)
    $                        before an optional \n, and the end of the
                             string
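
The breakdown above can also live inside the pattern itself by using `re.VERBOSE`. This is a sketch of that idea, assuming the same pattern as the answer; the name `EMAIL_RE` is hypothetical, and `[a-zA-Z0-9-_]` is written as `[a-zA-Z0-9_-]`, which matches the same characters.

```python
import re

EMAIL_RE = re.compile(
    r"""
    ^
    (?:                      # acct: one of two alternatives, not captured
        [a-zA-Z0-9]+         #   one or more letters/digits
        [a-zA-Z0-9-]*        #   then letters, digits, or '-'
      |
        [a-zA-Z0-9]+
        [a-zA-Z0-9-]+
        [.]                  #   a single literal period
        [a-zA-Z0-9]{2,}      #   at least two letters/digits after it
    )
    @
    [a-zA-Z0-9]+             # domain: one or more letters/digits
    [a-zA-Z0-9_-]+           #   then letters, digits, '-' or '_'
    [.]
    [a-zA-Z0-9]{2,}
    $
    """,
    re.VERBOSE,
)

for email in ["john@example.com", "john.doe@example.com",
              "a.bc@example.com", "john@example", "john@mail.example.com"]:
    print(email, bool(EMAIL_RE.search(email)))
```

The first two addresses match and the last three do not. The last case shows a limit of this pattern: it allows only one period in the domain (and at most one in acct), so it is stricter than the stated requirements.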
    

    Answer 2: (score: 1)

    Here is a regex that validates all of the criteria, and I hope it is also more efficient.

    ^(?![\W_])((?:([\w-]{2,})\.?){1,})(?<![\W_])@(?![\W_])(?=[\w.-]{5,})(?=.+\..+)((?:([\w-]{2,})\.?){1,})(?<![\W_])$
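
A quick way to try this pattern from Python is sketched below. The helper `is_valid` and the sample addresses are hypothetical, chosen to exercise individual rules; the pattern is copied verbatim and only split across two string literals for readability. Note that in Python 3 `\w` also matches non-ASCII word characters unless `re.ASCII` is passed.

```python
import re

# The pattern from the answer above, split after the "@" for readability.
EMAIL_RE = re.compile(
    r"^(?![\W_])((?:([\w-]{2,})\.?){1,})(?<![\W_])@"
    r"(?![\W_])(?=[\w.-]{5,})(?=.+\..+)((?:([\w-]{2,})\.?){1,})(?<![\W_])$"
)

def is_valid(email):
    """Return True if the string matches the answer's pattern."""
    return EMAIL_RE.search(email) is not None

# Hypothetical sample addresses.
samples = [
    "john.doe@example.com",   # ordinary address
    "john@ab.cd",             # domain of exactly 5 characters
    "a.bc@example.com",       # only one character before a period
    "_john@example.com",      # acct starts with an underscore
    "john.@example.com",      # acct ends with a period
    "john@abc",               # domain too short and has no period
]
for s in samples:
    print(s, is_valid(s))
```

The first two samples are accepted and the rest are rejected. One caveat: because every run needs at least two characters, a single-character acct such as `a@ab.cd` is also rejected, even though the requirements allow an acct of one character.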
    

    Here is the regex demo