I'm having trouble with a regular expression: I can't extract the main domain from a URL. For example, given the URLs below:
http://domain.com/return/java.php?hello.asp
http://www.domain.com/return/java.php?hello.asp
http://blog.domain.net/return/java.php?hello.asp
http://us.blog.domain.co.us/return/java.php?hello.asp
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca/return/java.php?hello.asp
http://us.domain.com/return/
From all of these, the regular expression should output only the domain. How can I do that? I have tried:
var url = urls.match(/[^.]*.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk)/g);
But it does not work for
http://domain.net
Can anyone help me solve this?
Answer 0 (score: 3)
You can use URL instead of a regular expression:
var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.hostname);
=> domain.com
Or, if you also want the protocol:
var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.protocol+"//"+url.hostname);
=> http://domain.com
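As a rough sketch of how the URL approach behaves on the other inputs from the question: the helper name hostnameOf below is hypothetical, not part of the URL API. Note that hostname keeps any subdomain (www, blog, us), so it does not reduce the address to the main domain on its own.

```javascript
// A few of the sample URLs from the question.
const sampleUrls = [
  "http://domain.com/return/java.php?hello.asp",
  "http://www.domain.com/return/java.php?hello.asp",
  "http://blog.domain.net/return/java.php?hello.asp",
  "http://domain.net",
];

// hostnameOf is a hypothetical helper name used only for this illustration.
function hostnameOf(address) {
  try {
    return new URL(address).hostname;
  } catch (err) {
    // new URL() throws a TypeError when the input is not a valid absolute URL.
    return null;
  }
}

for (const address of sampleUrls) {
  console.log(address, "=>", hostnameOf(address));
}
```

Returning null for unparsable input is just one possible design choice; rethrowing the error may suit other callers better.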
Answer 1 (score: 0)
(http|https|ftp):\/\/([a-zA-Z0-9.])+/g
This matches:
http://domain.com
http://www.domain.com
http://blog.domain.net
http://us.blog.domain.co.us
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca
http://us.domain.com
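A small sketch of how this pattern could be applied in JavaScript follows. The function name matchSchemeAndHost and the hyphenated test address are assumptions made for illustration. The character class contains only letters, digits, and dots, so a host with a hyphen is cut off at the hyphen.

```javascript
// The pattern from this answer, unchanged.
const SCHEME_AND_HOST = /(http|https|ftp):\/\/([a-zA-Z0-9.])+/g;

// matchSchemeAndHost is a hypothetical helper name used only for this illustration.
function matchSchemeAndHost(address) {
  // With the g flag, match() returns an array of full matches, or null if there are none.
  const matches = address.match(SCHEME_AND_HOST);
  return matches ? matches[0] : null;
}

console.log(matchSchemeAndHost("http://us.blog.domain.co.us/return/java.php?hello.asp"));
// The hyphen is not in the character class, so the match stops before it.
console.log(matchSchemeAndHost("http://my-site.com/page"));
```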
Answer 2 (score: 0)
Here is a solution that changes the regular expression:
url.match(/https?:\/\/[^/]+((?=\/)|$)/g);
//tested with Chrome 38+ on Win7
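It basically checks that the host is followed by a slash (/) or by the end of the string ($).
Inline Stack Overflow snippet code, replacing the jsFiddle link: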
var urls = ['http://domain.com/return/java.php?hello.asp',
'http://www.domain.com/return/java.php?hello.asp',
'http://blog.domain.net/return/java.php?hello.asp',
'http://us.blog.domain.co.us/return/java.php?hello.asp',
'http://domain.co.uk',
'http://domain.net',
'http://www.blog.domain.co.ca/return/java.php?hello.asp',
'http://us.domain.com/return/'
];
var htmlConsole = document.getElementById("result");
var htmlTab = " ";
var htmlNewLine = "<br />";
htmlConsole.innerHTML = "";
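// Iterate by index: for...in on an array would also visit inherited enumerable properties.
for (var id = 0; id < urls.length; id++) {
  htmlConsole.innerHTML += "URL: " + urls[id] + htmlNewLine;
  // match() returns null when nothing matches, so fall back to an empty array.
  var matchResults = urls[id].match(/https?:\/\/[^/]+((?=\/)|$)/g) || [];
  for (var innerIdx = 0; innerIdx < matchResults.length; innerIdx++) {
    htmlConsole.innerHTML += htmlTab + "MatchNumber: " + innerIdx + " MatchValue: " + matchResults[innerIdx] + htmlNewLine;
  }
  htmlConsole.innerHTML += htmlNewLine;
}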
<div id="result">
</div>
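The slash-or-end-of-string check can also be tried without a page, for example in Node.js. This is a minimal sketch; the function name schemeAndHost is hypothetical, and the g flag is dropped on the assumption that only one match per URL is wanted.

```javascript
// Same pattern as in this answer, minus the g flag (one match per URL is assumed).
const ORIGIN_PATTERN = /https?:\/\/[^/]+((?=\/)|$)/;

// schemeAndHost is a hypothetical helper name used only for this illustration.
function schemeAndHost(address) {
  const match = address.match(ORIGIN_PATTERN);
  // match[0] is the whole matched text: scheme, "://", and everything up to the first slash.
  return match ? match[0] : null;
}

console.log(schemeAndHost("http://us.domain.com/return/")); // host followed by a slash
console.log(schemeAndHost("http://domain.net"));            // host followed by end of string
```

Like the original pattern, this returns the scheme together with the full host, subdomains included.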
Answer 3 (score: -1)
var url = urls.match(/[^./]*.(com|net|org|info|coop|int|co\.uk|co\.us|co\.ca|org\.uk|ac\.uk|uk)/g);
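I just added / to the negated character class and updated the list of top-level domains to match your examples.
That said, I don't recommend keeping the list of top-level domains in the regexp; there are too many of them. See http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains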
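One way to keep the suffix list out of the regexp is to parse the host with URL and compare its last labels against a separate set. The sketch below is only an illustration under stated assumptions: the function name mainDomain and the MULTI_LABEL_SUFFIXES set are hypothetical, and the set covers just the two-label suffixes that appear in this thread, not a complete list of real-world suffixes.

```javascript
// Illustrative subset only (an assumption for this sketch); real data would be far larger.
const MULTI_LABEL_SUFFIXES = new Set(["co.uk", "org.uk", "ac.uk", "co.us", "co.ca"]);

// mainDomain is a hypothetical helper name used only for this illustration.
function mainDomain(address) {
  const labels = new URL(address).hostname.split(".");
  // One or two labels: nothing to strip.
  if (labels.length <= 2) return labels.join(".");
  // Keep three labels when the last two form a known two-label suffix, otherwise two.
  const lastTwo = labels.slice(-2).join(".");
  const keep = MULTI_LABEL_SUFFIXES.has(lastTwo) ? 3 : 2;
  return labels.slice(-keep).join(".");
}

console.log(mainDomain("http://us.blog.domain.co.us/return/java.php?hello.asp"));
console.log(mainDomain("http://www.domain.com/return/java.php?hello.asp"));
```

Any suffix missing from the set falls back to keeping the last two labels, so the result is only as good as the list supplied.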