Question

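I am using the following regular expression to validate this requirement:

Alphanumeric characters only
Hyphen -, underscore _, slash / and colon : are allowed
May have whitespace at the beginning and end (whitespace is ignored)

For example: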

aaa:bbb:ccc
aaa/bbb/ccc
 aa-bb
 dd

Currently this validation does not work for ':'. How can I fix it?

@"^\s?(?:(?:-?[A-z0-9]+)*|(?:_?[A-z0-9]+)*|(?:\/?[A-z0-9]+/?)*)\s*$"

Answer 1

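You can use the following regular expression.

From your comments, I assume you want to make sure the same delimiter is used throughout the string.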

@"^\s?[a-zA-Z0-9]+(?:([/:_-])[a-zA-Z0-9]+(?:\1[a-zA-Z0-9]+)*)?\s*$"

See the Live demo

Regular expression:

^                 # the beginning of the string
\s?               # whitespace (\n, \r, \t, \f, and " ") (optional)
[a-zA-Z0-9]+      # any character of: 'a' to 'z', 'A' to 'Z', '0' to '9' (1 or more times)
(?:               # group, but do not capture (optional)
(                 # group and capture to \1:
 [/:_-]           # any character of: '/', ':', '_', '-'
)                 # end of \1
 [a-zA-Z0-9]+     # any character of: 'a' to 'z', 'A' to 'Z', '0' to '9' (1 or more times)
 (?:              # group, but do not capture (0 or more times)
  \1              # what was matched by capture \1
  [a-zA-Z0-9]+    # any character of: 'a' to 'z', 'A' to 'Z', '0' to '9' (1 or more times)
 )*               # end of grouping
)?                # end of grouping
 \s*              # whitespace (\n, \r, \t, \f, and " ") (0 or more times)
$                 # before an optional \n, and the end of the string

You can simplify this slightly by making the match case-insensitive.

(?i)^\s?[a-z0-9]+(?:([/:_-])[a-z0-9]+(?:\1[a-z0-9]+)*)?\s*$
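
As a quick sanity check, the simplified pattern can be run against the sample inputs from the question. This is only a sketch using Python's `re` module rather than .NET; the pattern is copied unchanged, and the helper name `is_valid` is an assumption made for illustration.

```python
import re

# Simplified pattern from the answer; the leading (?i) makes [a-z] case-insensitive.
PATTERN = re.compile(r"(?i)^\s?[a-z0-9]+(?:([/:_-])[a-z0-9]+(?:\1[a-z0-9]+)*)?\s*$")


def is_valid(text: str) -> bool:
    """Return True if text satisfies the pattern (hypothetical helper)."""
    return PATTERN.match(text) is not None


# Inputs from the question, plus a few that should be rejected.
samples = ["aaa:bbb:ccc", "aaa/bbb/ccc", " aa-bb", " dd", "aaa:bbb/ccc", "-aa", "aa-"]
for sample in samples:
    print(repr(sample), is_valid(sample))
```

The four inputs from the question are accepted. The last three are rejected because they mix delimiters or put a delimiter at the start or end of the string.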

Answer 2

If you look at the regular expression you are currently using, it has a pattern for matching each delimiter:

//            here                 here                  and here
//            v                    v                     v
@"^\s?(?:  (?:-?[A-z0-9]+)*  |  (?:_?[A-z0-9]+)*  |  (?:\/?[A-z0-9]+/?)*  )\s*$"

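Notice that it only handles -, _ and /; the colon is nowhere to be found. So you could just add another section to the expression for the colon...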

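But don't do that. The regular expression repeats itself a lot, and you can easily make it shorter and easier to understand. There are many equally good alternatives; here is one example, based on my interpretation that only one delimiter style is allowed and that a delimiter cannot appear at the start or end of the string: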

^\s?[A-Za-z0-9]+(?:([:\/_-])[A-Za-z0-9]+(?:\1[A-Za-z0-9]+)*)?\s*$

Explanation:

^\s?               # start of string, single optional whitespace
[A-Za-z0-9]+       # MUST start with alphanumeric characters
(?:
  ([:\/_-])        # capture the delimiter
  [A-Za-z0-9]+     # which must be followed by alphanumeric characters
  (?:
    \1[A-Za-z0-9]+ # repeated strings of the SAME delimiter
  )*               # 0 or more times
)?                 # and, in fact, the entire delimiter section is optional
\s*$               # optional trailing whitespace
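
The commented layout above can be used almost verbatim if the regex engine supports a free-spacing mode. The sketch below uses Python's `re.VERBOSE` for this (.NET offers `RegexOptions.IgnorePatternWhitespace` for the same purpose); the names `SAME_DELIMITER` and `uses_single_delimiter` are assumptions made for illustration.

```python
import re

# The commented pattern, compiled in verbose mode so that whitespace and
# trailing comments inside the pattern are ignored by the engine.
SAME_DELIMITER = re.compile(
    r"""
    ^\s?                 # start of string, single optional whitespace
    [A-Za-z0-9]+         # must start with alphanumeric characters
    (?:
      ([:/_-])           # capture the delimiter
      [A-Za-z0-9]+       # which must be followed by alphanumeric characters
      (?:
        \1[A-Za-z0-9]+   # further segments using the SAME delimiter
      )*                 # 0 or more times
    )?                   # the entire delimiter section is optional
    \s*$                 # optional trailing whitespace
    """,
    re.VERBOSE,
)


def uses_single_delimiter(text: str) -> bool:
    """Return True if text is alphanumeric segments joined by one delimiter style."""
    return SAME_DELIMITER.match(text) is not None


for candidate in ["aaa_bbb_ccc", "aaa-bbb_ccc", "abc", "a::b"]:
    print(repr(candidate), uses_single_delimiter(candidate))
```

Because the backreference `\1` must repeat whatever delimiter was captured first, a string such as `aaa-bbb_ccc` is rejected while `aaa_bbb_ccc` is accepted.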

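Previously, I used the character class [A-z0-9], with the side effect that some extra characters (such as _) were matched unintentionally. It turns out this is caused by how character class ranges are defined. According to MSDN:

A character range is a contiguous series of characters defined by specifying the first character in the series, a hyphen (-), and then the last character in the series. Two characters are contiguous if they have adjacent Unicode code points.

Indeed, the code points of _ and a few other characters ([, \, ], ^ and `) lie between those of the uppercase and lowercase letters.

Lesson learned: character ranges are code point ranges, and they may include characters you did not expect.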
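
To see which characters the range A-z actually covers, the code points between 'A' and 'z' can be listed directly. The following is a small sketch in Python; the variable name `extras` is an assumption made for illustration.

```python
import re

# Characters inside the code point range A..z that are not ASCII letters.
extras = [chr(cp) for cp in range(ord("A"), ord("z") + 1) if not chr(cp).isalpha()]
print(extras)  # ['[', '\\', ']', '^', '_', '`']

# The wide range accepts an underscore; the explicit ranges do not.
print(re.fullmatch(r"[A-z0-9]+", "a_b") is not None)     # True
print(re.fullmatch(r"[A-Za-z0-9]+", "a_b") is not None)  # False
```

Writing the two ranges out as A-Z and a-z avoids the six characters that sit between them.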
