Regular expression to identify usernames across multiple sites

Time: 2019-01-08 10:22:04

Tags: javascript regex

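I am trying to create a regular expression that identifies usernames on several sites.

There are several sites, each identified by its domain name, and I may add more later.

From there, I am looking for @xxxxxxx either directly after the domain or one level down, as in domain/tag/@xxxxxxxx. The username can be any number of characters long, and it is sometimes followed by a / and further content that I don't care about.

Basically, I am going through a list of domains, with or without http/https, and then looking for @ followed by alphanumerics in the first or second path position, up to the next / or the end of the string.

Example URLs: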

https://site1.com/@bob
https://site2.com/boats/@frank/how-to-fix-your-boat
http://site2.com/@frank/settings
site1.com/@joe.beans/re-how-to-fix-your-boat-248435252

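I want to identify the @username in every type of URL that may show up.

I will maintain a list of the domains being searched, and some may be added over time. I will use JS to iterate over the list and fill in that part of the regex.

I believe a regex will be the fastest approach, unless something else available to Chrome extensions would be easier.

1 answer:

Answer 0: (score: 1)

You can use the regex /(.+)\/@([^\/\r\n]+)/ to capture the site and the username, like this: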

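var url = 'https://site2.com/boats/@frank/how-to-fix-your-boat'; // sample input
var site, user;
var re = /(.+)\/@([^\/\r\n]+)/;
var match = re.exec(url);
if (match != null) {
    site = match[1]; // everything before the last "/@"
    user = match[2]; // username, up to the next "/" or the end of the line
}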

Usage examples:

'https://site1.com/@bob'                                  --> site = "https://site1.com";       user = "bob"
'https://site2.com/boats/@frank/how-to-fix-your-boat'     --> site = "https://site2.com/boats"; user = "frank"
'http://site2.com/@frank/settings'                        --> site = "http://site2.com";        user = "frank"
'site1.com/@joe.beans/re-how-to-fix-your-boat-248435252'  --> site = "site1.com";               user = "joe.beans"
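
As a quick way to check the results listed above, the following sketch runs the same regex over the sample URLs from the question. The helper name extractSiteAndUser is made up for this illustration and is not part of the original answer.

```javascript
// Sketch: apply the answer's first regex to the question's sample URLs.
const re = /(.+)\/@([^\/\r\n]+)/;

// Hypothetical helper: returns { site, user }, or null when no "/@" is found.
function extractSiteAndUser(url) {
  const match = re.exec(url);
  return match ? { site: match[1], user: match[2] } : null;
}

const samples = [
  'https://site1.com/@bob',
  'https://site2.com/boats/@frank/how-to-fix-your-boat',
  'http://site2.com/@frank/settings',
  'site1.com/@joe.beans/re-how-to-fix-your-boat-248435252',
];

for (const url of samples) {
  console.log(url, extractSiteAndUser(url));
}
```

Note that this pattern accepts any site, because (.+) matches whatever precedes "/@"; it does not check the URL against a list of known domains.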


Edit

If you want to capture the protocol, domain, and user, this should do it:

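var url = 'http://site2.com/@frank/settings'; // sample input
var protocol, domain, user;
var re = /^((?:http|ftp)s?:\/\/)?(?:www\.)?([^@\/\r\n]+)?(?:\/.+)?\/@([^\/\r\n]+)/;
var match = re.exec(url);
if (match != null) {
    protocol = match[1]; // undefined when the URL has no protocol
    domain   = match[2];
    user     = match[3];
}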

This produces:

url                                                         match[1]  match[2]   match[3]
---                                                         --------  --------   --------
https://site1.com/@bob                                  --> https://  site1.com  bob
https://site2.com/boats/@frank/how-to-fix-your-boat     --> https://  site2.com  frank
http://site2.com/@frank/settings                        --> http://   site2.com  frank
site1.com/@joe.beans/re-how-to-fix-your-boat-248435252  -->           site1.com  joe.beans
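
The question mentions maintaining a list of domains and filling that part of the regex from JS, and wanting the @username only in the first or second path position. The regex above does neither, so here is one possible sketch of that idea. The domain list and the names escapeRegExp, buildUsernameRegex, and extractUser are assumptions made for this example, and the pattern is a restricted variant of the one above, not the same regex.

```javascript
// Hypothetical list of sites to search; extend it as new sites are added.
const domains = ['site1.com', 'site2.com'];

// Escape regex metacharacters so that "." in a domain matches a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one regex that accepts only the listed domains.
function buildUsernameRegex(domainList) {
  const alternatives = domainList.map(escapeRegExp).join('|');
  return new RegExp(
    '^(?:https?:\\/\\/)?(?:www\\.)?' + // optional protocol and "www."
    '(' + alternatives + ')' +          // group 1: one of the listed domains
    '(?:\\/[^\\/@\\r\\n]+)?' +          // at most one path segment, e.g. "/boats"
    '\\/@([^\\/\\r\\n]+)',              // group 2: username up to next "/" or end
    'i'                                 // host names are case insensitive
  );
}

const usernameRe = buildUsernameRegex(domains);

// Returns { domain, user }, or null for unlisted sites or deeper paths.
function extractUser(url) {
  const match = usernameRe.exec(url);
  return match ? { domain: match[1], user: match[2] } : null;
}

console.log(extractUser('https://site2.com/boats/@frank/how-to-fix-your-boat'));
// → { domain: 'site2.com', user: 'frank' }
console.log(extractUser('site1.com/@joe.beans/re-how-to-fix-your-boat-248435252'));
// → { domain: 'site1.com', user: 'joe.beans' }
console.log(extractUser('https://other.com/@bob'));
// → null
```

Rebuilding the regex whenever the list changes keeps the matching in a single exec call per URL, instead of looping over the domains for every URL.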

Regex details

"^" +                Assert position at the beginning of the string (the regex has no "m" flag, so positions after line breaks do not count)
"(" +                Match the regex below and capture its match into backreference number 1
   "(?:" +           Match the regular expression below
                     Match this alternative (attempting the next alternative only if this one fails)
         "http" +    Match the character string “http” literally (case sensitive, since no "i" flag is set)
      "|" +
                     Or match this alternative (the entire group fails if this one fails to match)
         "ftp" +     Match the character string “ftp” literally (case sensitive, since no "i" flag is set)
   ")" +
   "s" +             Match the character “s” literally (case sensitive, since no "i" flag is set)
      "?" +          Between zero and one times, as many times as possible, giving back as needed (greedy)
   ":" +             Match the character “:” literally
   "\\/" +           Match the character “/” literally
   "\\/" +           Match the character “/” literally
")" +
   "?" +             Between zero and one times, as many times as possible, giving back as needed (greedy)
"(?:" +              Match the regular expression below
   "www" +           Match the character string “www” literally (case sensitive, since no "i" flag is set)
   "\\." +           Match the character “.” literally
")" +
   "?" +             Between zero and one times, as many times as possible, giving back as needed (greedy)
"(" +                Match the regex below and capture its match into backreference number 2
   "[^" +            Match any single character NOT present in the list below
      "@" +          The literal character “@”
      "\\/" +        The literal character “/”
      "\r" +         The carriage return character
      "\n" +         The line feed character
   "]" +
      "+" +          Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")" +
   "?" +             Between zero and one times, as many times as possible, giving back as needed (greedy)
"(?:" +              Match the regular expression below
   "/" +             Match the character “/” literally
   "." +             Match any single character that is NOT a line break character (line feed, carriage return, line separator, paragraph separator)
      "+" +          Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")" +
   "?" +             Between zero and one times, as many times as possible, giving back as needed (greedy)
"/@" +               Match the character string “/@” literally
"(" +                Match the regex below and capture its match into backreference number 3
   "[^" +            Match any single character NOT present in the list below
      "\\/" +        The literal character “/”
      "\r" +         The carriage return character
      "\n" +         The line feed character
   "]" +
      "+" +          Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")"