How to ignore the characters around a URL in a regular expression

Time: 2016-07-09 12:50:37

Tags: javascript regex

I have the following regular expression

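$document.bind("keypress", function(e) {
    // Shift + "/" (keyCode 191), ignored while the user types in a form field
    if (e.shiftKey && e.keyCode == 191 &&
        !$(e.target).is("input, textarea") // <=== The check
        ) {
        // It did
    }
    // Condition without the target check: if (e.shiftKey && e.keyCode == 191) {...}
});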

I am able to capture the following URLs correctly:

var URL_REGEX = /(^|[\s\n]|<br\/?>)((?:(?:https?|ftp):\/\/)?[\-A-Z0-9\u00A0-\uD7FF\uE000-\uFDCF\uFDF0-\uFFFD+\u0026\u2019@#\/%?=()~_|!:,.;]*[\-A-Z0-9+\u0026@#\/%=~()_|])/gi;

But suppose I have

var someString1 = "hello http://stackoverflow.com";
var someString2 = "hello www.stackoverflow.com";
var someString3 = "hello stackoverflow.com";
var someString4 = "hello stackoverflow.com?foo=bar&foo=baz&foo-bar=baz";

I capture the URL and the parentheses (which I don't want). How can I capture only the URL?

This one cannot be captured. I get no match:

var wrappedUrl = "hello (www.stackoverflow.com)";

2 answers:

Answer 0: (score: 2)

You can use

/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi

See the regex demo

Explanation

  • ((https?|ftp)\:\/\/)? - Scheme (optional)
  • ([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)? - User name and password (optional)
  • ([a-z0-9-.]*)\.([a-z]{2,4}) - Host name (domain plus a 2 to 4 letter TLD)
  • (\:[0-9]{2,5})? - Port (optional)
  • (\/([a-z0-9+\$_-]\.?)+)*\/? - Path
  • (\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)? - GET query string (optional)
  • (#[a-z_.-][a-z0-9+\$_.-]*)? - Anchor (optional)
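As a rough illustration of how the parts listed above map to capture groups, the sketch below reads them back by group number. The helper name `describeUrl` and the sample strings are assumptions made for this example, not part of the answer. The pattern is the same one without the `g` flag, so `exec()` keeps no state between calls. The path group is left out on purpose: it sits inside a repetition, so it would only hold the last path segment.

```javascript
// Same pattern as the answer, minus the g flag (no lastIndex state between calls).
const URL_PARTS = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i;

// Hypothetical helper: returns the first URL found in `text`, split into parts.
function describeUrl(text) {
  const m = URL_PARTS.exec(text);
  if (m === null) return null;
  return {
    url: m[0],                               // whole URL, without surrounding brackets
    scheme: m[2],                            // undefined when the URL has no scheme
    host: m[5] + "." + m[6],                 // domain + "." + TLD
    port: m[7] ? m[7].slice(1) : undefined,  // drop the leading ":"
    query: m[10],                            // includes the leading "?"
    anchor: m[11],                           // includes the leading "#"
  };
}

console.log(describeUrl("hello (http://stackoverflow.com:8080/questions?foo=bar#top)"));
console.log(describeUrl("hello (www.stackoverflow.com)"));
```

Groups that did not take part in the match come back as `undefined`, which is why `scheme`, `query` and `anchor` need no extra handling here.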

See the JS demo:

var re = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi;
var str = `hello http://stackoverflow.com
hello www.stackoverflow.com
hello stackoverflow.com
hello stackoverflow.com?foo=bar&foo=baz&foo-bar=baz
hello [www.stackoverflow.com]
hello (www.stackoverflow.com)`;

var m; // declared explicitly so the loop does not create an implicit global
// With the g flag, each exec() call resumes where the previous match ended.
while ((m = re.exec(str)) !== null) {
    document.body.innerHTML += m[0] + "<br/>"; // m[0] is the URL without the surrounding brackets
}
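The demo above writes into the page, so it only runs in a browser. A minimal sketch of the same idea that returns the matches as an array, which is easier to reuse or to run under Node.js, could look like this. The helper name `extractUrls` and the `sample` variable are assumptions for this example.

```javascript
// Same pattern as the answer; the g flag makes match() return every full match.
const URL_PATTERN = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi;

// Hypothetical helper: every URL found in `text`, or an empty array if there is none.
function extractUrls(text) {
  // String.prototype.match returns null when nothing matches, hence the fallback.
  return text.match(URL_PATTERN) || [];
}

const sample = [
  "hello http://stackoverflow.com",
  "hello www.stackoverflow.com",
  "hello stackoverflow.com",
  "hello stackoverflow.com?foo=bar&foo=baz&foo-bar=baz",
  "hello [www.stackoverflow.com]",
  "hello (www.stackoverflow.com)",
].join("\n");

console.log(extractUrls(sample));
```

The wrapping `[...]` and `(...)` are dropped because the host part of the pattern starts with `[a-z0-9-.]*`, which contains neither brackets nor parentheses, so a match can only begin at the first host character.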

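Answer 1: (score: 0)

I tried this regex: /((http|https|ftp):?\/\/)?[a-z-A-Z]*(\.[a-z-A-Z]*)+(\?([a-z-A-Z0-9_]+=[a-z-A-Z0-9_]+(&)?)*)?/
It works perfectly in all the cases you showed. In any case, it would be worth looking at a RegExp reference and trying to build the expression from scratch yourself.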
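A quick way to check that claim is to run the pattern against the strings from the question. The sketch below does so; `firstMatch` and the `cases` table are names made up for this example. It also shows one caveat: because both letter runs in `[a-z-A-Z]*(\.[a-z-A-Z]*)+` may be empty, ordinary sentence punctuation such as `end.` matches too, so the pattern is probably best treated as a starting point.

```javascript
// The pattern from this answer, unchanged (no flags).
const SIMPLE_URL = /((http|https|ftp):?\/\/)?[a-z-A-Z]*(\.[a-z-A-Z]*)+(\?([a-z-A-Z0-9_]+=[a-z-A-Z0-9_]+(&)?)*)?/;

// Hypothetical helper: the first match in `text`, or null when there is none.
function firstMatch(text) {
  const m = SIMPLE_URL.exec(text);
  return m === null ? null : m[0];
}

// Input strings taken from the question, paired with the URL we hope to get back.
const cases = [
  ["hello http://stackoverflow.com", "http://stackoverflow.com"],
  ["hello www.stackoverflow.com", "www.stackoverflow.com"],
  ["hello stackoverflow.com", "stackoverflow.com"],
  ["hello stackoverflow.com?foo=bar&foo=baz&foo-bar=baz", "stackoverflow.com?foo=bar&foo=baz&foo-bar=baz"],
  ["hello (www.stackoverflow.com)", "www.stackoverflow.com"],
];

for (const [input, expected] of cases) {
  console.log(input, "=>", firstMatch(input), firstMatch(input) === expected ? "ok" : "MISMATCH");
}

// Caveat: a word followed by a full stop also matches.
console.log(firstMatch("The end."));
```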