Ruby String#scan equivalent that returns MatchData

Asked: 2012-03-02 04:34:50

Tags: ruby regex string parsing pattern-matching

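As the title says: is there a method on Ruby strings equivalent to String#scan that, instead of returning just a list of each match, returns an array of MatchData objects? For example: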

# Matches a set of characters between underscore pairs
"foo _bar_ _baz_ hashbang".some_method(/_[^_]+_/) #=> [#<MatchData "_bar_">, #<MatchData "_baz_">]

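Any other way to get the same or a similar result would also be fine. I want this in order to find the positions and extents of "strings" within Ruby strings, e.g. "goodbye" and "world" inside "'goodbye' cruel 'world'".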

3 answers:

Answer 0 (score: 12)

memo = []
"foo _bar_ _baz_ hashbang".scan(/_[^_]+_/) { memo << Regexp.last_match }
 => "foo _bar_ _baz_ hashbang"
memo
 => [#<MatchData "_bar_">, #<MatchData "_baz_">]
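
The same block-based idea can be wrapped in a small helper and used to read off positions, which is what the question is after. This is only a sketch: the method name scan_matches is made up for illustration, and it relies on String#scan updating Regexp.last_match before each block call. The capture group in the pattern is also an addition, so that the offsets of the text inside the quotes can be read with MatchData#begin and MatchData#end.

```ruby
# Hypothetical helper (scan_matches is not part of Ruby):
# collect a MatchData object for every match that String#scan finds.
def scan_matches(str, re)
  found = []
  # scan updates Regexp.last_match before each block call
  str.scan(re) { found << Regexp.last_match }
  found
end

quoted = scan_matches("'goodbye' cruel 'world'", /'([^']*)'/)

# Character offsets of the text inside each pair of quotes (capture group 1).
spans = quoted.map { |m| [m[1], m.begin(1), m.end(1)] }
p spans
# [["goodbye", 1, 8], ["world", 17, 22]]
```

The end offsets are exclusive, so "goodbye" occupies characters 1 through 7.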

Answer 1 (score: 7)

You could easily build your own by using MatchData#end together with the pos parameter of String#match. Something like this:

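def matches(s, re)
  start_at = 0
  matches  = []
  # Stop once the search position runs past the end of the string.
  while start_at <= s.length && (m = s.match(re, start_at))
    matches.push(m)
    # Resume after this match; step one character further after an
    # empty match so the loop cannot get stuck at the same position.
    start_at = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
  matches
end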

Then:

>> matches("foo _bar_ _baz_ hashbang", /_[^_]+_/)
=> [#<MatchData "_bar_">, #<MatchData "_baz_">]
>> matches("_a_b_c_", /_[^_]+_/)
=> [#<MatchData "_a_">, #<MatchData "_c_">]
>> matches("_a_b_c_", /_([^_]+)_/)
=> [#<MatchData "_a_" 1:"a">, #<MatchData "_c_" 1:"c">]
>> matches("pancakes", /_[^_]+_/)
=> []

If you really wanted to, you could monkey patch that into String.
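
A minimal sketch of what such a patch might look like, assuming the method is called match_all (a made-up name, not something core Ruby provides). It repeats the String#match loop from this answer and adds a guard so that patterns which can match the empty string still terminate.

```ruby
# Sketch of a monkey patch; String#match_all is a hypothetical name.
class String
  def match_all(re)
    found = []
    pos = 0
    # Stop once the search position runs past the end of the string.
    while pos <= length && (m = match(re, pos))
      found << m
      # Step past empty matches so the loop always makes progress.
      pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
    end
    found
  end
end

hits = "_a_b_c_".match_all(/_([^_]+)_/)
p hits.map { |m| [m[1], m.begin(0), m.end(0)] }
# [["a", 0, 3], ["c", 4, 7]]
```

Reopening String affects every string in the program, so a plain helper method is usually the less invasive choice.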

Answer 2 (score: 1)

If you don't need MatchData objects back, here is a way to do it with StringScanner:

require 'strscan'

rxp = /_[^_]+_/
scanner = StringScanner.new "foo _barrrr_ _baz_ hashbang"
match_infos = []
until scanner.eos?
  scanner.scan_until rxp
  if scanner.matched?
    match_infos << {
      pos: scanner.pre_match.size,
      length: scanner.matched_size,
      match: scanner.matched
    }
  else
    break
  end
end

p match_infos
# [{:pos=>4, :length=>8, :match=>"_barrrr_"}, {:pos=>13, :length=>5, :match=>"_baz_"}]
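
If only the extents are needed, the same scan can be written more compactly by collecting ranges. This is a sketch under two assumptions: the pattern never matches the empty string (otherwise the loop would not advance), and the variable names text and ranges are introduced here for illustration only.

```ruby
require 'strscan'

text = "foo _barrrr_ _baz_ hashbang"
rxp = /_[^_]+_/ # assumed never to match the empty string
scanner = StringScanner.new(text)
ranges = []

# scan_until returns nil once no further match exists, which ends the loop.
while scanner.scan_until(rxp)
  start = scanner.pre_match.size # number of characters before the match
  ranges << (start...(start + scanner.matched.size))
end

p ranges
# [4...12, 13...18]
```

Each range can be passed straight back to String#[], e.g. text[ranges.first] gives "_barrrr_".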