How to get the domain from a string using a JavaScript regular expression

Time: 2014-08-15 08:04:57

Tags: javascript regex string string-matching capturing-group

As the title suggests, I am trying to retrieve the domain from a string using a JavaScript regular expression.

Take the following strings:

String                                  ==>     Return
"google"                                ==>     null
"google.com"                            ==>     "google.com"
"www.google.com"                        ==>     "www.google.com"
"ftp://ftp.google.com"                  ==>     "ftp.google.com"
"http://www.google.com"                 ==>     "www.google.com"
"http://www.google.com/"                ==>     "www.google.com"
"https://www.google.com/"               ==>     "www.google.com"
"https://www.google.com.sg/"            ==>     "www.google.com.sg"
"https://www.google.com.sg/search/"     ==>     "www.google.com.sg"
"*://www.google.com.sg/search/"         ==>     "www.google.com.sg"

I have already read "Regex to find domain name without www - Stack Overflow" and "Extract root domain name from string - Stack Overflow", but they were too complicated, so I tried writing my own regular expression:

var re = new RegExp("[\\w]+[\\.\\w]+");
// Equivalent regex literal: /[\w]+[\.\w]+/
re.exec(document.URL);

This works fine with "google.com", "www.google.com" and "www.google.com.sg", but it returns "http" for "http://google.com/", "http://www.google.com/", and so on.
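For reference, the behaviour described above can be reproduced with a minimal sketch. `firstMatch` is a hypothetical helper introduced only for illustration; it is not part of the original post. `exec` returns the leftmost match, and since `[\w]+` happily matches the scheme letters, the match ends at the first character that is neither a word character nor a dot (here the `:`).

```javascript
// The pattern from the question, written as a regex literal.
var re = /[\w]+[\.\w]+/;

// Hypothetical helper: returns the text of the first match, or null.
function firstMatch(input) {
  var m = re.exec(input);
  return m ? m[0] : null;
}

console.log(firstMatch("www.google.com"));         // "www.google.com"
console.log(firstMatch("http://www.google.com/")); // "http"
console.log(firstMatch("google"));                 // "google" (the table expects null)
```

Note that the pattern never requires a dot, which is why a bare word such as "google" also matches.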

Since I am not familiar with regular expressions, I can't seem to figure out what is wrong... Any ideas?

Thanks in advance!

2 answers:

Answer 0: (score: 8)

Use this regex:

/(?:[\w-]+\.)+[\w-]+/

Here is a regex demo.

Sample:

>>> var regex = /(?:[\w-]+\.)+[\w-]+/
>>> regex.exec("google.com")
... ["google.com"]
>>> regex.exec("www.google.com")
... ["www.google.com"]
>>> regex.exec("ftp://ftp.google.com")
... ["ftp.google.com"]
>>> regex.exec("http://www.google.com")
... ["www.google.com"]
>>> regex.exec("http://www.google.com/")
... ["www.google.com"]
>>> regex.exec("https://www.google.com/")
... ["www.google.com"]
>>> regex.exec("https://www.google.com.sg/")
... ["www.google.com.sg"]
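The samples above can be wrapped in a small function that returns either the matched string or `null`, matching the "Return" column in the question. This is only a sketch: the names `DOMAIN_RE` and `extractDomain` are assumptions made for illustration, not part of the answer.

```javascript
// Pattern from the answer: one or more "label." groups followed by a final label.
var DOMAIN_RE = /(?:[\w-]+\.)+[\w-]+/;

// Hypothetical helper: returns the first dotted, host-like substring, or null.
function extractDomain(input) {
  var match = DOMAIN_RE.exec(input);
  return match ? match[0] : null;
}

console.log(extractDomain("google"));                        // null
console.log(extractDomain("ftp://ftp.google.com"));          // "ftp.google.com"
console.log(extractDomain("*://www.google.com.sg/search/")); // "www.google.com.sg"

// Limitation: the pattern takes the first dotted run anywhere in the string.
console.log(extractDomain("http://localhost/index.html"));   // "index.html"
```

Because the pattern requires at least one dot, "google" yields `null`. The same requirement means a dotless host followed by a dotted path is misread, as the last call shows.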

Answer 1: (score: 1)

You can use this regex in JavaScript:

\b(?:(?:https?|ftp):\/\/)?([^\/\n]+)\/?

RegEx Demo
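With this pattern the domain is in capture group 1 rather than in the whole match, so the result has to be read from index 1 of the `exec` result. A minimal sketch, assuming the hypothetical names `URL_RE` and `hostFromCapture`:

```javascript
// Pattern from the answer: optional http/https/ftp scheme, then everything
// up to the next slash is captured as group 1.
var URL_RE = /\b(?:(?:https?|ftp):\/\/)?([^\/\n]+)\/?/;

// Hypothetical helper: returns capture group 1 of the first match, or null.
function hostFromCapture(input) {
  var match = URL_RE.exec(input);
  return match ? match[1] : null;
}

console.log(hostFromCapture("ftp://ftp.google.com"));              // "ftp.google.com"
console.log(hostFromCapture("https://www.google.com.sg/search/")); // "www.google.com.sg"
console.log(hostFromCapture("google"));                            // "google" (the table expects null)
```

Unlike the pattern in Answer 0, this one does not require a dot, so a bare word such as "google" is returned as-is instead of `null`.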