Question

I have a Ruby application that parses a bunch of URLs out of a string:

@text = "a string with a url http://example.com"

urls = @text.split.grep(/http[s]?:\/\/\w/)

urls[0]  # => "http://example.com"

This works fine ^^

But sometimes the URL has text directly in front of the http://, for example:

@text = "What's a spacebar? ...http://example.com"

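@text.split.grep(/http[s]?:\/\/\w/)[0]  # => "...http://example.com"

Is there a regular expression that selects the text before "http://" in a string, so that I can remove it?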

Answer 1

A better way to get the same result may be to use the URI standard library.

require 'uri'
text = "a string with a url http://example.com and another URL here:http://2.example.com and this here"
URI.extract(text, ['http', 'https'])
# => ["http://example.com", "http://2.example.com"]

Documentation: URI.extract

Answer 2

Splitting and then grepping is an odd way to do this. Why not use String#scan instead:

@text = "a string with a url http://example.com"
urls = @text.scan(/http[s]?:\/\/\S+/)
urls[0]  # => "http://example.com"
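
As a quick check against the problem case from the question, here is a minimal sketch that runs the same scan on the string with the leading "...". The variable names are only illustrative.

```ruby
text = "What's a spacebar? ...http://example.com"

# scan returns every substring that matches the pattern, so the
# "..." in front of the URL never becomes part of a result.
urls = text.scan(/https?:\/\/\S+/)

urls[0]  # => "http://example.com"
```

Note that \S+ runs up to the next whitespace, so punctuation that directly follows a URL (a closing period, for instance) would be captured as well.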

Answer 3

.*(?=http://)
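
One way this pattern could be applied, sketched under the assumption that the tokens come from the split/grep approach in the question: the lookahead matches everything in front of "http://" without consuming the "http://" itself, so String#sub can replace that prefix with an empty string. The names tokens and prefix are illustrative, and %r{} is used only to avoid escaping the slashes.

```ruby
text = "What's a spacebar? ...http://example.com"

# Everything in front of "http://"; the lookahead leaves the URL untouched.
prefix = %r{.*(?=http://)}

# Same tokenizing as in the question, restricted to http:// tokens.
tokens = text.split.grep(%r{http://\w})

# Strip the unwanted prefix from each token.
urls = tokens.map { |token| token.sub(prefix, "") }

urls  # => ["http://example.com"]
```

Because .* is greedy, a token containing "http://" more than once would be cut back to the last occurrence.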

Answer 4

Or you can combine the two (the s has to be optional, otherwise only ftps:// and https:// match):

.*(?=(f|ht)tps?://)
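
A small sketch of the combined pattern in use, assuming the intent is to accept http, https, ftp and ftps, which means writing the s as optional (s?). The sample strings are made up for illustration.

```ruby
# Prefix in front of any of http://, https://, ftp://, ftps://.
prefix = %r{.*(?=(?:f|ht)tps?://)}

samples = [
  "...http://example.com",
  "see:https://example.com",
  ">>ftp://example.com"
]

# Remove whatever precedes the scheme in each sample.
cleaned = samples.map { |s| s.sub(prefix, "") }

cleaned  # => ["http://example.com", "https://example.com", "ftp://example.com"]
```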

Answer 5

Just search for http:// and then remove the part of the string before it (since =~ returns the offset of the match within the string).
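
A minimal sketch of that idea, with illustrative variable names: =~ returns the index where the match starts (or nil when there is no match), and slicing from that index drops everything in front of the URL.

```ruby
text = "...http://example.com"

# Index of the first "http://" or "https://", or nil if there is none.
offset = text =~ %r{https?://}

# Keep the string from the match onward; leave it unchanged if nothing matched.
url = offset ? text[offset..-1] : text

url  # => "http://example.com"
```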

Regex to remove the text before "http://"?