Question

This is a follow-up to "Lazy (ungreedy) matching multiple groups using regex". I tried the approach from that question, without much success.

I fetch a string from the GitLab API and try to extract all the repos. The repo names follow the format "https://gitlab.example.com/foo/xxx.git".

So far, this works when I try it:

gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\//)

But adding the wildcard for the name is tricky. I am using the approach from the previous question:

gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/(.*?)\.git\"/)

It says to use (.*?) for lazy matching, but it does not seem to work.

Any help is much appreciated.

Answer 1

Suppose we have the following string:

gitlab_str = "\"https://gitlab.example.com/foo/xxx.git\""

The following regex returns [["xxx"]], as expected:

gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/(.*?)\.git\"/)

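That is because you have (.*?). Note the parentheses: when the pattern contains a group, scan returns only what the group captured. To return the whole matched string instead, remove the parentheses: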

gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/.*?\.git\"/)

This returns:

["\"https://gitlab.example.com/foo/xxx.git\""]
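The difference between the two return shapes can be seen side by side. The following is a minimal sketch; the variable names (sample, with_group, without_group, names) are chosen here for illustration, and the sample string stands in for a real GitLab API response, which will look different. It also shows flatten as one possible way to turn the nested captures into a flat list.

```ruby
# Sample input assumed for illustration; a real API response will differ.
sample = "\"https://gitlab.example.com/foo/xxx.git\""

# With a capture group, scan returns one array of captures per match.
with_group = sample.scan(/"https:\/\/gitlab\.example\.com\/foo\/(.*?)\.git"/)

# Without a group, scan returns the full matched strings.
without_group = sample.scan(/"https:\/\/gitlab\.example\.com\/foo\/.*?\.git"/)

# flatten collapses the nested capture arrays into a flat list of names.
names = with_group.flatten

p with_group     # [["xxx"]]
p without_group  # ["\"https://gitlab.example.com/foo/xxx.git\""]
p names          # ["xxx"]
```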

It also works with multiple occurrences:

> gitlab_str = "\"https://gitlab.example.com/foo/xxx.git\" and \"https://gitlab.example.com/foo/yyy.git\""
> gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/.*?\.git\"/)

=> ["\"https://gitlab.example.com/foo/xxx.git\"", "\"https://gitlab.example.com/foo/yyy.git\""]
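To see why the ? matters with multiple occurrences, it may help to compare the lazy pattern with a greedy one on the same string. This is a sketch, not part of the original answer; two_repos, lazy and greedy are names made up for the comparison.

```ruby
# Same two-repo string as above, assumed here as sample input.
two_repos = "\"https://gitlab.example.com/foo/xxx.git\" and \"https://gitlab.example.com/foo/yyy.git\""

# Lazy: (.*?) stops at the first following .git" so each repo matches separately.
lazy = two_repos.scan(/"https:\/\/gitlab\.example\.com\/foo\/(.*?)\.git"/)

# Greedy: (.*) runs to the last .git" in the string, producing one long match.
greedy = two_repos.scan(/"https:\/\/gitlab\.example\.com\/foo\/(.*)\.git"/)

p lazy    # [["xxx"], ["yyy"]]
p greedy  # [["xxx.git\" and \"https://gitlab.example.com/foo/yyy"]]
```

With the greedy quantifier the single capture swallows everything between the first prefix and the final .git", which is the behavior lazy matching is meant to avoid.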

Finally, if you want to drop the https:// part from the returned match, wrap everything except that part in () in the regex:

gitlab_str.scan(/\"https\:\/\/(gitlab\.example\.com\/foo\/.*?\.git)\"/)
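As a sketch of what that last pattern returns on the two-repo string, the example below writes the same regex with a %r{...} literal, which avoids escaping every forward slash. The %r form and the names two_repos, pattern and repo_paths are choices made for this illustration, not something the original answer uses.

```ruby
# Two-repo string assumed as sample input.
two_repos = "\"https://gitlab.example.com/foo/xxx.git\" and \"https://gitlab.example.com/foo/yyy.git\""

# %r{...} lets "/" appear unescaped; the group excludes https:// and the quotes.
pattern = %r{"https://(gitlab\.example\.com/foo/.*?\.git)"}

# scan yields one capture array per match; flatten gives a flat list.
repo_paths = two_repos.scan(pattern).flatten

p repo_paths  # ["gitlab.example.com/foo/xxx.git", "gitlab.example.com/foo/yyy.git"]
```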
