Question

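I'm still new to regex, and I'm trying to write a regular expression to validate IDs for an application I'm building.

The ID constraints are as follows: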

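1. It may only start with A-Z, a-z, a comma (,), an apostrophe ('), or a hyphen (-).
2. It may contain all of the above plus a period (.), just not at the beginning.
3. It must contain at least two A-Z | a-z characters.
4. A character may appear only once. (,, should not match, only ,)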

Edit: I was unclear about the fourth point. It should only prohibit consecutive symbols, not consecutive letters.

So far, all I have is

^(A-Za-z',-)(A-Za-z',-\\.)+$        // I'm using java hence the reason for the `\\.`

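I don't know how to match a specific number of something in my regex. I assume it's simple, but any help would be much appreciated.

I'm very new to regular expressions and I'm really lost as to how to do this.

Edit: the final regex is as follows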

^(?=.*[A-Za-z].*[A-Za-z].*)(?!.*(,|'|\-|\.)\1.*)[A-Za-z,'\-][A-Za-z,'\-\.]*

Thanks to Ro Yo Mi and RebelWitoutAPulse!
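
For readers who want to try the final pattern from Java, here is a minimal sketch. The class name IdValidator and the method isValid are hypothetical names chosen for illustration; the pattern itself is the one shown above, with backslashes doubled for a Java string literal. Since the pattern has no trailing $, the sketch assumes Matcher.matches() is used, which requires the whole input to match; with find() a valid prefix of an otherwise invalid ID would be accepted.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original question.
final class IdValidator {
    // The final pattern from the question, escaped for a Java string literal.
    private static final Pattern ID = Pattern.compile(
        "^(?=.*[A-Za-z].*[A-Za-z].*)(?!.*(,|'|\\-|\\.)\\1.*)[A-Za-z,'\\-][A-Za-z,'\\-\\.]*");

    static boolean isValid(String id) {
        // matches() requires the entire input to match, so no trailing $ is needed.
        return ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"ab", "-a.b", "aab", "a,,b", ".ab", "a.-", "a1b"};
        for (String id : samples) {
            System.out.println(id + " -> " + isValid(id));
        }
    }
}
```

Note that the negative lookahead only rejects the same symbol twice in a row (such as ,,); repeated letters such as aab are still accepted, which is what the edit to point 4 asks for.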

Answer 1

Description

^(?!\.)(?=(?:.*?[A-Za-z]){2})(?:([a-zA-Z,'.-])(?!.*?\1))+$

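This regular expression will do the following:

(?!\.)
- Verifies that the string does not start with .
(?=(?:.*?[A-Za-z]){2})
- Verifies that the string contains at least two A-Z | a-z characters
(?:([a-zA-Z,'.-])(?!.*?\1))+
- Allows the string to contain only a-z, A-Z, comma, apostrophe, period, and hyphen
- Allows each character to appear only once anywhere in the string (,, should not match, only ,)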

Example

Live Demo

https://regex101.com/r/hO2mU1/1

Sample text

-abced
aabdefsa
abcdefs
.abded
ac.dC
ab
a.b

Sample matches

-abced
abcdefs
ac.dC
ab
a.b
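
The sample above can also be checked from Java. This is only a sketch: the class name UniqueCharIdCheck and the method isValid are hypothetical, and each sample line is tested on its own with Matcher.matches() rather than in multiline mode as in the live demo.

```java
import java.util.regex.Pattern;

// Hypothetical helper class used only to replay the sample text.
final class UniqueCharIdCheck {
    // The pattern from this answer, escaped for a Java string literal.
    private static final Pattern ID = Pattern.compile(
        "^(?!\\.)(?=(?:.*?[A-Za-z]){2})(?:([a-zA-Z,'.-])(?!.*?\\1))+$");

    static boolean isValid(String id) {
        return ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"-abced", "aabdefsa", "abcdefs", ".abded", "ac.dC", "ab", "a.b"};
        for (String s : samples) {
            // Print only the lines that pass, mirroring the sample matches.
            if (isValid(s)) {
                System.out.println(s);
            }
        }
    }
}
```

Run this way, it prints the same five strings listed under the sample matches. aabdefsa is rejected because a repeats, and .abded because it starts with a period.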

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
    \.                       '.'
----------------------------------------------------------------------
  )                        end of look-ahead
----------------------------------------------------------------------
  (?=                      look ahead to see if there is:
----------------------------------------------------------------------
    (?:                      group, but do not capture (2 times):
----------------------------------------------------------------------
      .*?                      any character (0 or more times
                               (matching the least amount possible))
----------------------------------------------------------------------
      [A-Za-z]                 any character of: 'A' to 'Z', 'a' to
                               'z'
----------------------------------------------------------------------
    ){2}                     end of grouping
----------------------------------------------------------------------
  )                        end of look-ahead
----------------------------------------------------------------------
  (?:                      group, but do not capture (1 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    (                        group and capture to \1:
----------------------------------------------------------------------
      [a-zA-Z,'.-]             any character of: 'a' to 'z', 'A' to
                               'Z', ',', ''', '.', '-'
----------------------------------------------------------------------
    )                        end of \1
----------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
      .*?                      any character (0 or more times
                               (matching the least amount possible))
----------------------------------------------------------------------
      \1                       what was matched by capture \1
----------------------------------------------------------------------
    )                        end of look-ahead
----------------------------------------------------------------------
  )+                       end of grouping
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------

Answer 2

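You can use positive/negative lookahead. Roughly speaking, when the regex engine encounters a lookahead, it pauses the main match and tests the subpattern inside the lookahead from the current position. A positive lookahead must match there and a negative lookahead must not; in either case no characters are consumed.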
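
To make the "no characters are consumed" part concrete, here is a small sketch in Java. The class LookaheadDemo and the method matchedPrefix are hypothetical names used only for illustration; the short patterns are simplified fragments, not the full ID regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing that lookaheads test the input without consuming it.
final class LookaheadDemo {
    // Returns the text matched at the start of the input, or null if nothing matches there.
    static String matchedPrefix(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Positive lookahead: two letters must exist somewhere ahead,
        // yet [a-z] still matches the very first character afterwards.
        System.out.println(matchedPrefix("(?=.*[A-Za-z].*[A-Za-z])[a-z]", "a.b"));
        // Only one letter in the input, so the positive lookahead fails.
        System.out.println(matchedPrefix("(?=.*[A-Za-z].*[A-Za-z])[a-z]", "a.-"));
        // Negative lookahead: fails because the next character is a period.
        System.out.println(matchedPrefix("(?!\\.).", ".ab"));
        // Negative lookahead succeeds, then . consumes the first character.
        System.out.println(matchedPrefix("(?!\\.).", "ab"));
    }
}
```

In the first call the matched text is just the single character a, even though the lookahead scanned as far as b to find the second letter.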

The regex could be:

^(?=.*[A-Za-z].*[A-Za-z].*)(?!.*(.)\1.*)[A-Za-z,'\-][A-Za-z,'\-\.]*

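Explanation:

^ - the beginning of the string
(?=.*[A-Za-z].*[A-Za-z].*) - continue matching only if the string contains any number of any characters, then a letter from A-Z or a-z, then any number of any characters, then another letter from A-Z or a-z, then anything else. This effectively covers point 3.
(?!.*(.)\1.*) - stop matching if the string contains the same character twice in a row. It skips over anything, remembers one character using a capture group, and then checks whether the character immediately after it is the same as the captured one. This covers point 4.

Note: if point 4 means that every character in the string should be unique, you can add .* between (.) and \1.

Note: once the lookaheads have succeeded, the regex engine's "cursor" is back at the beginning of the string.
[A-Za-z,'\-] - the "real" match begins. This character class covers point 1.
[A-Za-z,'\-\.]* - the rest of the string: the characters from point 1 plus the period, which covers point 2.

I'm not sure about the specifics of Java's regex flavor, but a quick Google search suggests this should be possible there. Synthetic tests worked: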

Astring # match
,string # match

.string # does not match

a.- # does not match: there are fewer than two characters from [A-Za-z]

doesnotmatch  # does not match when every character must be unique (the variant from the note): 't' occurs twice, non-consecutively
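
The test strings above can be run against both forms of the negative lookahead. The following is a sketch under assumptions: the class LookaheadVariants and its constant names are hypothetical, and the patterns are the one from this answer plus the variant described in the note (with .* added between (.) and \1).

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the two negative-lookahead variants from this answer.
final class LookaheadVariants {
    private static final String HEAD = "^(?=.*[A-Za-z].*[A-Za-z].*)";
    private static final String TAIL = "[A-Za-z,'\\-][A-Za-z,'\\-\\.]*";

    // As written in the answer: rejects the same character twice in a row.
    static final Pattern NO_ADJACENT_REPEAT =
        Pattern.compile(HEAD + "(?!.*(.)\\1.*)" + TAIL);

    // Variant from the note: rejects any character that occurs twice anywhere.
    static final Pattern ALL_UNIQUE =
        Pattern.compile(HEAD + "(?!.*(.).*\\1.*)" + TAIL);

    static boolean matches(Pattern p, String id) {
        // matches() requires the whole input to match the pattern.
        return p.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Astring", ",string", ".string", "a.-", "doesnotmatch"};
        for (String s : samples) {
            System.out.println(s
                + " adjacent=" + matches(NO_ADJACENT_REPEAT, s)
                + " unique=" + matches(ALL_UNIQUE, s));
        }
    }
}
```

The two variants agree on the first four strings. They differ on doesnotmatch, which has no character repeated back to back and is therefore accepted by the adjacent-repeat form but rejected by the all-unique form.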

P.S. The regex could be tightened considerably by using a defined character class instead of ., but that would add a lot of visual clutter to the answer.
