URL regular expression in C#

Asked: 2013-03-03 21:16:35

Tags: c# regex url-validation

I am trying to find a regular expression that validates as many URLs as possible. I am using it on an input field in MVC3:

[RegularExpression(@"expression...")]

I found this regular expression by "diegoperini", which I like a lot, but I don't know how to convert it from the PHP version to a .NET version:

_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})
(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})
(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])
(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))
|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.
(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*
(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS

What would it look like in .NET?

1 Answer:

Answer 0: (score: 8)

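You have to remove the PHP delimiters (the leading _ and the trailing _iuS; the i flag becomes RegexOptions.IgnoreCase) and replace the \x{0000}-style escapes with \u0000.
So the regular expression should look like this: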

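using System;
using System.Text.RegularExpressions;

// Sample input; replace it with the URL you want to test.
String sourcestring = "source string to match with pattern";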
Regex re = new Regex(@"^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$",RegexOptions.IgnoreCase | RegexOptions.Multiline);
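// Note: RegexOptions.Multiline lets ^ and $ match at line breaks; drop it when validating a single value.
Match m = re.Match(sourcestring);
// Print every group; the pattern has no capturing groups, so only group 0 (the whole match) is listed.
for (int gIdx = 0; gIdx < m.Groups.Count; gIdx++)
{
   Console.WriteLine("[{0}] = {1}", re.GetGroupNames()[gIdx], m.Groups[gIdx].Value);
}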

If needed, you can see a live example over here.
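The two conversion steps described above (strip the PHP delimiters, rewrite the \x{...} escapes) can also be done mechanically. The following is only a minimal sketch: the class and method names (PcrePatternConverter, ToDotNetPattern) are hypothetical, it assumes the first character of the PHP pattern is the delimiter, it only handles escapes of up to four hex digits, and it discards the trailing flags, so the caller still has to map i to RegexOptions.IgnoreCase by hand.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class PcrePatternConverter
{
    // Turns a delimited PHP/PCRE pattern such as "_^...$_iuS" into a pattern body usable by .NET.
    public static string ToDotNetPattern(string phpPattern)
    {
        if (string.IsNullOrEmpty(phpPattern))
            throw new ArgumentException("Pattern is empty.", "phpPattern");

        // Assumption: the first character is the delimiter and its last
        // occurrence closes the pattern; everything after it is flags.
        char delimiter = phpPattern[0];
        int end = phpPattern.LastIndexOf(delimiter);
        if (end <= 0)
            throw new ArgumentException("Closing delimiter not found.", "phpPattern");

        string body = phpPattern.Substring(1, end - 1);

        // Rewrite \x{hhhh} as \uhhhh, padding to four hex digits.
        // Limitation: an escaped backslash followed by x{..} would be rewritten too.
        return Regex.Replace(
            body,
            @"\\x\{([0-9A-Fa-f]{1,4})\}",
            m => @"\u" + m.Groups[1].Value.PadLeft(4, '0'));
    }
}
```

For example, passing @"_^[a-z\x{00a1}-\x{ffff}]+$_iuS" yields @"^[a-z\u00a1-\uffff]+$", which can then be handed to new Regex(..., RegexOptions.IgnoreCase).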

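Quick note: this pattern matches the full URL (including username, password, port, path, query and fragment), but it only validates the domain part in detail; the other parts are only loosely validated. (Thanks to @nhahtdh for the clue.)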
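To make the difference between the strictly checked host and the loosely checked remainder concrete, here is a small sketch that wraps the same pattern in a helper. The names UrlPatternCheck and IsValidUrl are hypothetical, the pattern is only split into pieces for readability, and RegexOptions.Multiline is left out on the assumption that a single value is being validated.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the converted pattern.
public static class UrlPatternCheck
{
    private const string Pattern =
        // scheme and optional user:password@
        @"^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?" +
        @"(?:" +
        // IP address, excluding private and local ranges
        @"(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})" +
        @"(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})" +
        @"(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}" +
        @"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))" +
        @"|" +
        // host name, sub-domains and TLD
        @"(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)" +
        @"(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*" +
        @"(?:\.(?:[a-z\u00a1-\uffff]{2,}))" +
        @")" +
        // optional port, then anything without whitespace after a slash
        @"(?::\d{2,5})?(?:/[^\s]*)?$";

    private static readonly Regex UrlRegex = new Regex(Pattern, RegexOptions.IgnoreCase);

    public static bool IsValidUrl(string candidate)
    {
        return candidate != null && UrlRegex.IsMatch(candidate);
    }
}
```

With this sketch, http://example.com/<>{} is accepted, because the path is just any run of non-whitespace characters after the slash, whereas http://10.0.0.1/ and http://192.168.1.1/ are rejected by the private-range lookaheads in the host part.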