Question

My URLs always follow one of these formats:

http://domain.tld/foo/bar/boo

http://www.domain.tld/foo/bar/boo

http://sub.domain.tld/foo/bar/boo

http://www.sub.domain.tld/foo/bar/boo

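I want to use a regular expression to extract bar from the URL, regardless of which format it uses.

I am using JavaScript.

I tried to break the URL apart with something like this: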

var x = 'http://domain.tld/foo/bar/boo'
x.split(/^((http[s]?):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/g)

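But this doesn't really work, and it doesn't help either, because I seem to get one or more items back when all I really need is the value of bar.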

Answer 1

var el = document.createElement('a');
el.href = "http://www.domain.tld/foo/bar/boo";
var importantPart = el.pathname.split('/')[2];
console.log(importantPart);

Fiddle: https://jsfiddle.net/dcyo4ph5/1/

Sources: https://css-tricks.com/snippets/javascript/get-url-and-url-parts-in-javascript/ and JavaScript - Get Portion of URL Path

I realize this doesn't use a regex, so it may not be what you wanted.

Answer 2

Splitting and slicing makes this simple: split('/') creates an array, slice(-2) picks the last two items, and [0] takes the first of those two.

With replace(/\/$/, "") you can remove any trailing slash (shown in the fourth example below).

Stack snippet

var x = 'http://domain.tld/foo/bar/boo'
console.log( x.split('/').slice(-2)[0] );

var x = 'http://www.sub.domain.tld/foo/bar/boo'
console.log( x.split('/').slice(-2)[0] );

var x = 'http://www.domain.tld/foo/bar/boo'
console.log( x.split('/').slice(-2)[0] );

// and this one will trim trailing slash
var x = 'http://www.domain.tld/foo/bar/boo/'
console.log( x.replace(/\/$/, "").split('/').slice(-2)[0] );

Or just reverse the array and take the second item (index 1, because arrays are zero-based).
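
The reverse-based variant mentioned in this answer is not spelled out in code, so here is a minimal sketch of how it might look. The helper name secondToLast is hypothetical, and the inputs are the URL shapes from the question.

```javascript
// Hypothetical helper: returns the second-to-last path segment.
function secondToLast(url) {
  return url
    .replace(/\/$/, "") // drop a single trailing slash, if present
    .split("/")         // break the string into segments
    .reverse()[1];      // after reversing, index 1 is the second-to-last segment
}

console.log(secondToLast("http://domain.tld/foo/bar/boo"));      // bar
console.log(secondToLast("http://www.domain.tld/foo/bar/boo/")); // bar
```

Compared with slice(-2)[0], this reverses the whole array just to read one element, so it is mostly a matter of taste.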

Answer 3

I'll show both a regex way and a non-regex way. Surprisingly, the regex version turns out to be shorter.

Regex Way

The regex to find bar and boo is /.*\/(.*)\/(.*)$/. It is short, precise, and does exactly what you need.

Let's put it into practice,

const params = "http://www.sub.domain.tld/foo/bar/boo".match(/.*\/(.*)\/(.*)$/)

The result,

params;
["http://www.sub.domain.tld/foo/bar/boo","bar","boo"]

Just access params[1] for bar and params[2] for boo; params[0] holds the full match.
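
One thing worth guarding against when reading the captures: match() returns null when the pattern does not match at all. The sketch below wraps the same pattern in a hypothetical helper (extractLastTwo is not part of the original answer) and skips index 0, which holds the whole matched string.

```javascript
// Hypothetical wrapper around the pattern from this answer.
function extractLastTwo(url) {
  const params = url.match(/.*\/(.*)\/(.*)$/);
  if (params === null) return null; // no match, e.g. fewer than two slashes

  // Skip index 0 (the full match) and name the two captures.
  const [, secondToLast, last] = params;
  return { secondToLast, last };
}

const result = extractLastTwo("http://www.sub.domain.tld/foo/bar/boo");
console.log(result.secondToLast); // bar
console.log(result.last);         // boo
console.log(extractLastTwo("no-slashes-here")); // null
```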

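Regex explanation: the leading .* greedily consumes everything up to the second-to-last slash, the first (.*) captures the text between the last two slashes, and the second (.*) captures everything after the final slash up to the end of the string ($).

Extended version:

The regex can be extended further to match a pattern with a trailing slash, such as /bar/foo/, like this,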

.*\/\b(.*)\/\b(.*)(\/?)$

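This means that \b requires a word boundary right after each of the two slashes, so each captured segment must start with a word character, and (\/?) allows an optional slash at the end.

It can be extended further, but let's keep it simple for now.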
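
One caveat with the extended pattern: its second (.*) is greedy, so for an input ending in a slash the last capture comes back as boo/ rather than boo (bar is still captured correctly). If the last segment matters too, one possible alternative is to capture with [^\/]+, which cannot swallow a slash. The pattern and the helper name lastTwo below are assumptions for illustration, not the answer's own regex.

```javascript
// Assumed alternative pattern, not the one from the answer:
// [^\/]+ matches exactly one path segment and never includes a slash.
const lastTwoSegments = /\/([^\/]+)\/([^\/]+)\/?$/;

// Hypothetical helper returning [secondToLast, last], or null if no match.
function lastTwo(url) {
  const m = url.match(lastTwoSegments);
  return m ? [m[1], m[2]] : null;
}

console.log(lastTwo("http://domain.tld/foo/bar/boo"));  // [ 'bar', 'boo' ]
console.log(lastTwo("http://domain.tld/foo/bar/boo/")); // [ 'bar', 'boo' ]
```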

Non-Regex Way

Using native methods such as .split(),

function getLastParam(str, targetIndex = 1) {
  const arr = str
                .split("/") // split by slash
                .filter(e=>e); // remove empty array elements
  return arr[arr.length - targetIndex];
}

Let's quickly test a few different cases

[
  "http://domain.tld/foo/bar/boo",
  "http://www.domain.tld/foo/bar/boo",
  "http://sub.domain.tld/foo/bar/boo",
  "http://www.sub.domain.tld/foo/bar/boo",
  "http://domain.tld/foo/bar/boo/",
  ".../bar/boo"
].map(e => {
  console.log({ input: e, output: getLastParam(e, 1) });
});

This produces the following,

{input: "http://domain.tld/foo/bar/boo", output: "boo"}
{input: "http://www.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://sub.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://www.sub.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://domain.tld/foo/bar/boo/", output: "boo"}
{input: ".../bar/boo", output: "boo"}

If you need bar, pass 2 as targetIndex; it will return the second-to-last segment. In this case, getLastParam(str, 2) yields bar.
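
As a quick check of the targetIndex = 2 case, the sketch below repeats getLastParam from this answer so that it runs on its own and applies it to the same inputs as the test above. The urls array name is introduced here only for illustration.

```javascript
function getLastParam(str, targetIndex = 1) {
  const arr = str
    .split("/")       // split by slash
    .filter(e => e);  // remove empty array elements
  return arr[arr.length - targetIndex];
}

// Same inputs as the test above; the array name is illustrative.
const urls = [
  "http://domain.tld/foo/bar/boo",
  "http://www.domain.tld/foo/bar/boo",
  "http://sub.domain.tld/foo/bar/boo",
  "http://www.sub.domain.tld/foo/bar/boo",
  "http://domain.tld/foo/bar/boo/",
  ".../bar/boo"
];

for (const url of urls) {
  console.log(getLastParam(url, 2)); // bar for every input
}
```

Note that a targetIndex larger than the number of segments returns undefined, so callers may want to check for that.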

Speed Stuff

Here is a small benchmark: http://jsbench.github.io/#a6bcecaa60b7d668636f8f760db34483

getLastParamNormal: 5,203,853 ops/sec
getLastParamRegex: 6,619,590 ops/sec

Well, the difference doesn't really matter. But it is interesting.

Answer 4

You don't need a regex. The anchor element has an API that breaks the URL apart for you. You can then split pathname to get the path segments.

function parse(path) {
  let a = document.createElement('a');
  a.href = path;

  return a.pathname.split('/')[2];
}

console.log(parse('http://domain.tld/foo/bar/boo'));
console.log(parse('http://www.domain.tld/foo/bar/boo'));
console.log(parse('http://sub.domain.tld/foo/bar/boo'));
console.log(parse('http://www.sub.domain.tld/foo/bar/boo'));
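
The anchor-element approach needs a DOM. Where no document is available (for example in Node.js), a similar result can be sketched with the standard URL constructor, which exposes the same pathname property. This is a swapped-in technique rather than the one the answer uses, and the function name parseWithUrl is hypothetical.

```javascript
// Hypothetical variant using the standard URL constructor instead of an <a> element.
function parseWithUrl(path) {
  const url = new URL(path); // throws a TypeError if path is not an absolute URL

  // pathname is "/foo/bar/boo"; splitting gives ["", "foo", "bar", "boo"],
  // so index 2 is the second path segment.
  return url.pathname.split('/')[2];
}

console.log(parseWithUrl('http://domain.tld/foo/bar/boo'));         // bar
console.log(parseWithUrl('http://www.sub.domain.tld/foo/bar/boo')); // bar
```

Like the anchor version, this picks the segment by position from the start of the path, so it relies on the URLs always having the /foo/bar/boo shape.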

Extract part of a URL using Regex

4 answers:
