Question

I have the following regular expression:

a=/item\/([0-9]+)\)/.match("item/123) and also item/245)")

I am trying to extract the item IDs of all the links in the string, for example:

 [123,245]

But it returns

#<MatchData "item/123)" 1:"123">

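(i.e. only the first one). How can I get it to return both IDs, either as part of two MatchData objects or by some other method? I think I need to specify greediness, but I'm not sure.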

Answer 1

Don't capture what you don't need:

"item/123) and also item/245)".scan(%r{(?<=item/)\d+(?=\))})
# => ["123", "245"]
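As a small sketch of why this works: the lookbehind `(?<=item/)` and the lookahead `(?=\))` only assert the surrounding text, so they are not included in what scan returns. The variable names below are illustrative.

```ruby
s = "item/123) and also item/245)"

# Lookarounds check the context but are not part of the match itself.
with_lookarounds = s.scan(%r{(?<=item/)\d+(?=\))})

# Without lookarounds (and without a group), the whole match comes back.
whole_matches = s.scan(%r{item/\d+\)})

p with_lookarounds # => ["123", "245"]
p whole_matches    # => ["item/123)", "item/245)"]
```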

Answer 2

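You can use scan, which behaves as follows:

If the pattern contains no groups, each individual result consists of the matched string, $&. If the pattern contains groups, each individual result is itself an array containing one entry per group.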

"item/123) and also item/245)".scan(/item\/([0-9]+)\)/).flatten
# => ["123", "245"]
s = "item/123) and also item/245)"
s.scan(/item\/([0-9]+)\)/).flatten.map(&:to_i) # to get them as integers
# => [123, 245]
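To make the documented behavior concrete, here is a short sketch that runs scan with zero, one, and two groups on the same string. The second group around `item` is added purely for illustration.

```ruby
s = "item/123) and also item/245)"

no_groups  = s.scan(/item\/[0-9]+\)/)       # no groups: whole matches
one_group  = s.scan(/item\/([0-9]+)\)/)     # one group: one-element arrays
two_groups = s.scan(/(item)\/([0-9]+)\)/)   # two groups: two-element arrays

p no_groups  # => ["item/123)", "item/245)"]
p one_group  # => [["123"], ["245"]]
p two_groups # => [["item", "123"], ["item", "245"]]
```

This is why the single-group version needs flatten to produce a flat list of IDs.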

With match you get only one result, because match stops at the first match and your pattern has a single capture group:

a = /item\/([0-9]+)\)/.match("item/123) and also item/245)")
a.captures # => ["123"]

See also the captures method.
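If you do want a MatchData object for each match, as the question mentions, one possible approach is to call scan with a block and collect `Regexp.last_match` on each iteration. This is a sketch, not the only way to do it; `matches` and `pattern` are illustrative names.

```ruby
s = "item/123) and also item/245)"
pattern = /item\/([0-9]+)\)/

# scan with a block updates the last-match data before each yield,
# so Regexp.last_match is the MatchData for the current match.
matches = []
s.scan(pattern) { matches << Regexp.last_match }

p matches.map { |m| m[1] }       # => ["123", "245"]
p matches.map { |m| m.begin(0) } # => [0, 19]
```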

Answer 3

If all you want are the IDs, and there are no other numbers in the text, you can simplify your pattern considerably:

"item/123) and also item/245)".scan(/\d+/) # => ["123", "245"]

Or, if you want integers instead of strings:

"item/123) and also item/245)".scan(/\d+/).map(&:to_i) # => [123, 245]

If you need to skip some numbers, you can test for "item/" followed by a number, using the positive lookbehind suggested by @sawa, or use something along these lines:

"item/123) and also item/245)".scan(%r[item/\d+]).map{ |s| s[/\d+/] } # => ["123", "245"]

Or:

"item/123) and also item/245)".scan(%r[item/\d+]).map{ |s| s[/\d+/].to_i } # => [123, 245]

Breaking that down so you can see what is happening:

"item/123) and also item/245)".scan(%r[item/\d+]) # => ["item/123", "item/245"]
"item/123"[/\d+/] # => "123"

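Personally, I like @sawa's answer because it is concise and clear, finds only the digits that follow "item/", and avoids the capture group that forces the subsequent flatten. I just wanted to show how to do it with simpler patterns.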
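To illustrate the "no other numbers in the text" caveat, here is a sketch using a hypothetical input that contains an unrelated number. The string and variable names are made up for the example.

```ruby
# Hypothetical input: the leading "7" is not an item ID.
text = "order 7: item/123) and also item/245)"

unanchored = text.scan(/\d+/)
anchored   = text.scan(%r[item/\d+]).map { |m| m[/\d+/] }

p unanchored # => ["7", "123", "245"]
p anchored   # => ["123", "245"]
```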

Answer 4

You can use the scan method.

a = "item/123) and also item/245)".scan(/item\/([0-9]+)\)/)

This returns:

[["123"], ["245"]]

I'm not sure why the regular expression doesn't work, though.
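Since scan with one group returns one-element arrays, a possible way to get the integers the question asks for is to destructure each element in the block. This is a minimal sketch; `nested` and `ids` are illustrative names.

```ruby
nested = "item/123) and also item/245)".scan(/item\/([0-9]+)\)/)

# Each element is a one-element array, so destructure it in the block.
ids = nested.map { |(id)| Integer(id) }

p ids # => [123, 245]
```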
