Question

In a previous post, I asked for help rewriting a regular expression without negation.

The regex I started with:

https?:\/\/(?:.(?!https?:\/\/))+$

The regex I ended up with:

https?:[^:]*$

This works, but I noticed that when my URL contains a : other than the one in http: or https:, the URL is not matched.

Here is a string that does not work:

sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2

Notice the :query2 at the end.
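To make the failure concrete, here is a minimal Go sketch. The variable names and the colon-free control string are made up for illustration; only the regex and the failing string come from this post.

```go
package main

import (
	"fmt"
	"regexp"
)

// The second regex from above: [^:]* cannot cross any ':' after "http:".
var noColonRe = regexp.MustCompile(`https?:[^:]*$`)

func main() {
	// Hypothetical control string: the last URL has no extra ':', so it matches.
	ok := "sometexthttp://websites.com/path/#query1sometexthttp://websites.com/path/subpath/query2"
	// The failing string from this post: the last URL contains ":query2".
	bad := "sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2"

	fmt.Printf("%q\n", noColonRe.FindString(ok))  // "http://websites.com/path/subpath/query2"
	fmt.Printf("%q\n", noColonRe.FindString(bad)) // "" (no match at all)
}
```

The second call finds nothing because every candidate starting at "http:" runs into another : before reaching the end of the string.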

How can I modify the second regex listed here so that it also selects URLs that contain a :?

Expected output:

http://websites.com/path/subpath/cc:query2

Additionally, I want to select everything before the first occurrence of ?=param

Input: sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param

Output:

http://websites.com/path/subpath/cc:query2/text/

Answer 1

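Unfortunately, Go regex does not support lookarounds. However, you can get the last link with a trick: greedily match all possible links and other characters, then capture the last link with a capturing group: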

^(?:https?://|.)*(https?://\S+?)(?:\?=|$)

Because \S+? matches non-whitespace characters lazily, the capture also stops just before ?= when it is present.

See the regex demo and Go demo.

var r = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2", -1)[0][1])
fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param", -1)[0][1])

Result:

"http://websites.com/path/subpath/:query2"
"http://websites.com/path/subpath/cc:query2/text/"
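The snippet above indexes [0][1] directly, which panics when the input contains no link. A minimal sketch of a safer wrapper follows; lastLink and lastLinkRe are hypothetical names introduced here, not part of any library.

```go
package main

import (
	"fmt"
	"regexp"
)

// lastLinkRe is the pattern from this answer; group 1 holds the last link.
var lastLinkRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)

// lastLink is a hypothetical helper: it returns the last link in s,
// cut off before "?=" if present, and false when s contains no link.
func lastLink(s string) (string, bool) {
	m := lastLinkRe.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	inputs := []string{
		"sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param",
		"no links here",
	}
	for _, in := range inputs {
		if link, ok := lastLink(in); ok {
			fmt.Printf("%q\n", link)
		} else {
			fmt.Println("no link found")
		}
	}
}
```

FindStringSubmatch is enough here because the pattern is anchored with ^, so there can be at most one match per input.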

If the last link can contain whitespace, just use .+? instead:

^(?:https?://|.)*(https?://.+?)(?:\?=|$)
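A short sketch contrasting the two patterns on a link that contains a space. The example.com input string and the variable names are assumptions made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Link body limited to non-whitespace characters.
	noSpaceRe = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
	// Link body may contain any character except a newline.
	withSpaceRe = regexp.MustCompile(`^(?:https?://|.)*(https?://.+?)(?:\?=|$)`)
)

func main() {
	// Hypothetical input: the last link contains a space before "?=".
	in := "see http://example.com/a http://example.com/my file.txt?=x"

	// \S+? cannot get past a space to reach "?=" or the end of the
	// string from either link, so this pattern does not match at all.
	fmt.Println(noSpaceRe.FindStringSubmatch(in) == nil) // true

	// .+? can cross the space, so the last link is captured up to "?=".
	if m := withSpaceRe.FindStringSubmatch(in); m != nil {
		fmt.Printf("%q\n", m[1]) // "http://example.com/my file.txt"
	}
}
```

Note that . does not match a newline by default in Go, so the .+? variant still assumes the input is a single line.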
