I use a regular expression on my blog site to turn URLs into clickable links, and it works well. The pattern looks like this:
/(href=")?([-a-zA-Z0-9@:%_\+.~#?&\/\/=]{2,256}\.[a-z]{2,4}\b(\/?[-a-zA-Z0-9@:%_\+.~#?&\/\/=]+)?)/
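However, I recently found that this pattern also matches file names, so when a user posts a file name in a comment, the system turns it into a link.
What I want to achieve is to match the URL formats listed in the first table below and nothing else, so that strings such as mysite.com or filename.php are not highlighted.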
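To make the problem concrete, here is a minimal sketch that runs the pattern from the question against a few samples. It assumes Python's `re` module rather than whatever engine the blog actually uses, and the helper name `matched_text` is made up for illustration.

```python
import re

# The pattern from the question, without the surrounding / delimiters.
ORIGINAL = re.compile(
    r'(href=")?([-a-zA-Z0-9@:%_\+.~#?&\/\/=]{2,256}\.[a-z]{2,4}\b'
    r'(\/?[-a-zA-Z0-9@:%_\+.~#?&\/\/=]+)?)'
)


def matched_text(sample):
    """Return the URL-like text the pattern finds in sample, or None."""
    m = ORIGINAL.search(sample)
    return m.group(2) if m else None


for sample in ["http://www.mysite.com", "www.mysite.com", "mysite.com", "filename.php"]:
    print(sample, "->", matched_text(sample))
```

Every sample is matched in full, including filename.php, because nothing in the pattern requires a scheme or a www. prefix.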
Input that should match:
+--------------------------+------------------------------------------------------+
| Example                  | Explanation                                          |
+--------------------------+------------------------------------------------------+
| http(s)://www.mysite.com | because it starts with http(s):// and has URL format |
| www.mysite.com           | because it starts with www. and has URL format       |
+--------------------------+------------------------------------------------------+
Input that should not match:
+-------------------+--------------------------------------------------+
| Example           | Explanation                                      |
+-------------------+--------------------------------------------------+
| mysite.com        | because it doesn't start with http(s):// or www. |
|                   | even though it has URL format                    |
| http(s)://mytext  | because it doesn't have URL format               |
| http://localhost/ | because it doesn't have URL format               |
+-------------------+--------------------------------------------------+
What does the URL format look like?
For this case, the URL format can be specified with the following pattern:
([-a-zA-Z0-9_.]{2,256}\.[a-z]{2,4}\b(\/?[-a-zA-Z0-9:%_\+.~#?&\/=]+)?)
Examples:
google.com, google.co.uk, accounts.google.com, google.com/somepath/ ...
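As a quick check of that definition, the sketch below tests the URL-format pattern (with its parentheses balanced) against the examples above. Using Python's `re.fullmatch` is my assumption for "the whole string has URL format", and `has_url_format` is a hypothetical helper name.

```python
import re

# URL-format pattern from the question, with balanced parentheses.
URL_FORMAT = re.compile(
    r"([-a-zA-Z0-9_.]{2,256}\.[a-z]{2,4}\b(\/?[-a-zA-Z0-9:%_\+.~#?&\/=]+)?)"
)


def has_url_format(candidate):
    """True when the entire candidate string fits the URL-format pattern."""
    return URL_FORMAT.fullmatch(candidate) is not None


examples = ["google.com", "google.co.uk", "accounts.google.com", "google.com/somepath/"]
for candidate in examples + ["localhost", "mytext", "filename.php"]:
    print(candidate, "->", has_url_format(candidate))
```

The four examples pass and localhost and mytext fail, but filename.php passes as well, which is why the format on its own cannot tell a file name from a host name and a required prefix is needed.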
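I tried adding the string www\. to this pattern, but then nothing matched. How can I edit this regex so that it matches only URLs that start with 'www.' or 'http(s)://' and nothing else?
Thanks in advance.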
Answer 0 (score: 1)
This regex is definitely not perfect, but it will do what you want:
(http[s]?:\/\/|www\.|ftp:\/\/){1,2}([-a-zA-Z0-9_]{2,256}\.[a-z]{2,4}\b(\/?[-a-zA-Z0-9@:%_\+.~#?&\/=]+)?)
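It can be fooled by strings that are not real URLs, but that is unlikely to be abused. Making it smarter would add a lot of complexity.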
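Here is a rough sketch of how this pattern behaves on the samples from the question's tables, assuming Python's `re` module. Two things are my own additions rather than part of the answer: the dot after www is escaped so that it matches only a literal dot, and the `linkify` helper (including its choice of `http://` for www. links) is purely illustrative.

```python
import re

# The answer's pattern, with the dot in "www." escaped (an assumption).
URL_PATTERN = re.compile(
    r"(http[s]?:\/\/|www\.|ftp:\/\/){1,2}"
    r"([-a-zA-Z0-9_]{2,256}\.[a-z]{2,4}\b(\/?[-a-zA-Z0-9@:%_\+.~#?&\/=]+)?)"
)


def linkify(text):
    """Wrap every match in an anchor tag (hypothetical helper, no HTML escaping)."""
    def to_anchor(match):
        url = match.group(0)
        # Assumption: links without a scheme (www.…) get http:// in the href.
        href = url if "://" in url else "http://" + url
        return f'<a href="{href}">{url}</a>'
    return URL_PATTERN.sub(to_anchor, text)


should_match = ["http://www.mysite.com", "https://www.mysite.com", "www.mysite.com"]
should_not_match = ["mysite.com", "filename.php", "http://mytext", "http://localhost/"]

for sample in should_match + should_not_match:
    print(sample, "->", bool(URL_PATTERN.search(sample)))

print(linkify("see www.mysite.com and filename.php"))
```

Because the host part no longer allows a dot inside the first character class, a bare scheme followed by a dotless name such as http://mytext fails, while the required prefix keeps mysite.com and filename.php from being linked.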