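Substituting regular expression matches by group name in Ruby

Time: 2012-12-09 11:53:37

Tags: ruby regex

Is there a way to perform a substitution with a regular expression that uses named groups, addressing each group by its name, in Ruby?

Here is what I have so far (but as you will see, it discards the context of each match, which makes it useless in fairly common cases):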

class String

    def scan_in_groups( regexp )
        raise ArgumentError, 'Regexp does not contain any names.' if regexp.names.empty?

        captures = regexp.names.inject( {} ){ |h, n| h[n] = []; h }

        scan( regexp ).each do |match|
            captures.keys.zip( match ).each do |group, gmatch|
                next if !gmatch
                captures[group] << gmatch
            end
        end

        captures.reject { |_, v| v.empty? }
    end

    def sub_in_groups( regexp, group_hash )
        dup.sub_in_groups!( regexp, group_hash )
    end

    def sub_in_groups!( regexp, group_hash )
        scan_in_groups( regexp ).each do |name, value|
            next if !group_hash[name]
            sub!( value.first, group_hash[name] )
        end
        self
    end

end

regexp = /
    \/(?<category>\w+)         # matches category type
    \/                         # path separator
    (?<book-id>\d+)            # matches book ID numbers
    \/                         # path separator
    .*                         # irrelevant
    \/                         # path separator
    chapter-(?<chapter-id>\d+) # matches chapter ID numbers
    \/                         # path separator
    stuff(?<stuff-id>\d+)      # matches stuff ID numbers
/x

path = '/book/12/blahahaha/test/chapter-3/stuff4/12'

p path.scan_in_groups( regexp )
#=> {"category"=>["book"], "book-id"=>["12"], "chapter-id"=>["3"], "stuff-id"=>["4"]}

update = {
    'category'   => 'new-category',
    'book-id'    => 'new-book-id',
    'chapter-id' => 'new-chapter-id',
    'stuff-id'   => '-new-stuff-id'
}

p path.sub_in_groups( regexp, update )
#=> "/new-category/new-book-id/blahahaha/test/chapter-new-chapter-id/stuff-new-stuff-id/12"

p '/12/book/12/blahahaha/test/chapter-3/stuff4/12'.sub_in_groups( regexp, update )
#=> "/new-book-id/new-category/12/blahahaha/test/chapter-new-chapter-id/stuff-new-stuff-id/12"

What I need is a solution that preserves the context of the Regexp match and replaces each group in place, so that the final result is:

#=> /12/new-category/new-book-id/blahahaha/test/chapter-new-chapter-id/stuff-new-stuff-id/12

Is this possible?

2 answers:

Answer 0: (score: 0)

Are the words to be changed always the same?

# str is the string to modify in place
replacements = [ ["category", "new-category"], ["book-id", "new-book-id"], ["chapter-id", "new-chapter-id"], ["stuff-id", "-new-stuff-id"] ]
replacements.each {|from, to| str.gsub!(from, to)}

Answer 1: (score: 0)

One way to do this is:

def substitute!(regexp, string, updates)
  if match = regexp.match(string)
    # Process the groups from last to first so earlier offsets stay valid.
    keys_in_order = updates.keys.sort_by {|k| match.offset(k)}.reverse
    keys_in_order.each do |k|
      offsets_for_group = match.offset(k)
      string[offsets_for_group.first...offsets_for_group.last] = updates[k]
    end
  end
end

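This modifies the string in place.

Once you have the match data, match.offset(capture_name) returns the start and end offsets of that group, which this code then uses to perform the update. You need to make the replacements starting from the end of the string so that they do not shift the offsets of the groups still to be replaced.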
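
As a rough check, here is a sketch that applies this offset-based approach to the second path from the question. The method is repeated so the snippet runs on its own. Two things are my own assumptions rather than part of the original code: the group names use underscores instead of the question's hyphens, and the method returns the string for convenience.

```ruby
# Same idea as substitute! above: splice each named group by its offsets.
def substitute!(regexp, string, updates)
  match = regexp.match(string)
  return string unless match

  # Work from the rightmost group to the leftmost so that a replacement
  # never shifts the offsets of a group that has not been handled yet.
  updates.keys.sort_by { |name| match.begin(name) }.reverse_each do |name|
    first, last = match.offset(name)
    string[first...last] = updates[name]
  end
  string
end

# Assumption: underscore group names, not the hyphenated ones from the question.
regexp = %r{/(?<category>\w+)/(?<book_id>\d+)/.*/chapter-(?<chapter_id>\d+)/stuff(?<stuff_id>\d+)}

updates = {
  'category'   => 'new-category',
  'book_id'    => 'new-book-id',
  'chapter_id' => 'new-chapter-id',
  'stuff_id'   => '-new-stuff-id'
}

path = '/12/book/12/blahahaha/test/chapter-3/stuff4/12'.dup
substitute!(regexp, path, updates)
puts path
```

The leading /12 is left alone because only the captured spans are overwritten, so this should print the path the question asks for. Note that every key in updates has to be a group name defined in the regexp, otherwise match.offset raises an IndexError.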

If you only need to change one group, you can do:

x = "/foo/bar/baz"
x[/(?<group>bar)/, 'group'] = 'new'
# x is now '/foo/new/baz'
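
Both forms above only touch the first match and modify the receiver. If you need every match updated and the original string left alone, one possible approach is to combine gsub with the same offset trick. This is only a sketch: gsub_groups is a hypothetical helper name of mine, not something from Ruby's standard library, and the sample string and group names are made up for illustration.

```ruby
# Hypothetical helper: returns a new string in which, for every match of
# regexp, the named groups listed in updates are replaced.
def gsub_groups(string, regexp, updates)
  string.gsub(regexp) do
    match = Regexp.last_match
    text  = match[0].dup     # text of the current match only
    base  = match.begin(0)   # where the current match starts in string

    # Keep only groups that exist in the regexp and took part in this match.
    names = updates.keys.select { |name| match.names.include?(name) && match[name] }

    # Splice from right to left so earlier offsets remain valid.
    names.sort_by { |name| match.begin(name) }.reverse_each do |name|
      first, last = match.offset(name)
      text[(first - base)...(last - base)] = updates[name]
    end
    text
  end
end

original = '/book/12 and /film/7'
result   = gsub_groups(original, %r{/(?<category>\w+)/(?<id>\d+)}, { 'id' => 'N' })

puts result    #=> /book/N and /film/N
puts original  #=> /book/12 and /film/7
```

The group offsets are relative to the whole string, so they are shifted by the start of the current match before splicing into its text.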