How to parse the substring between the last set of parentheses in a string in Ruby

Time: 2009-03-28 06:52:39

Tags: ruby-on-rails ruby regex

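In my Ruby on Rails application, I am trying to build a parser to extract some metadata from a string.

Say the sample string is:

The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).

I want to extract the substring from the last occurrence of ().

So no matter how many () there are in the string, I want to get "ralph, 20".

Is there a best way to create this Ruby string extraction... a regexp?

Thanks,

John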

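3 answers:

Answer 0: (score: 2)

It looks like you want a sexeger. Sexegers work by reversing the string, running a reversed regex against the reversed string, and then reversing the result. Here is an example (forgive the code, I don't really know Ruby):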

#!/usr/bin/ruby

s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).";

reversed_s = s.reverse;
reversed_s =~ /^.*?\)(.*?)\(/;
result = $1.reverse;
puts result;
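
The snippet above raises an error when the string has no parentheses, because $1 is nil after a failed match. A minimal sketch of the same reverse, match, reverse idea wrapped in a method that returns nil instead; the method name last_parens_sexeger is made up for illustration and is not part of the original answer:

```ruby
# Hypothetical helper: same sexeger idea, but returns nil when nothing matches.
def last_parens_sexeger(s)
  # In the reversed string the last ")" of the original comes first,
  # so a lazy match up to the next "(" captures the last group, reversed.
  m = s.reverse.match(/\)(.*?)\(/)
  m && m[1].reverse
end

s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts last_parens_sexeger(s)
```

Like the original, this does not try to handle nested parentheses.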

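The fact that this has gotten no votes tells me that nobody clicked through to read why you would want to use a sexeger, so here are the results of a benchmark: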

do they all return the same thing?
ralph, 20
ralph, 20
ralph, 20
ralph, 20
                        user     system      total        real
scan greedy         0.760000   0.000000   0.760000 (  0.772793)
scan non greedy     0.750000   0.010000   0.760000 (  0.760855)
right index         0.760000   0.000000   0.760000 (  0.770573)
sexeger non greedy  0.400000   0.000000   0.400000 (  0.408110)

Here is the benchmark:

#!/usr/bin/ruby

require 'benchmark'

def scan_greedy(s)
    result = s.scan(/\([^)]*\)/x)[-1]
    result[1 .. result.length - 2]
end

def scan_non_greedy(s)
    result = s.scan(/\(.*?\)/)[-1]
    result[1 .. result.length - 2]
end

def right_index(s)
    s[s.rindex('(') + 1 .. s.rindex(')') -1]
end

def sexeger_non_greedy(s)
    s.reverse =~ /^.*?\)(.*?)\(/
    $1.reverse
end

s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).";

puts "do they all return the same thing?", 
    scan_greedy(s), scan_non_greedy(s), right_index(s), sexeger_non_greedy(s)

n = 100_000
Benchmark.bm(18) do |x|
    x.report("scan greedy")        { n.times do; scan_greedy(s); end }
    x.report("scan non greedy")    { n.times do; scan_non_greedy(s); end }
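    # As originally posted, this line called scan_greedy(s), so the "right index"
    # timing in the results above does not actually measure right_index.
    x.report("right index")        { n.times do; right_index(s); end }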
    x.report("sexeger non greedy") { n.times do; sexeger_non_greedy(s); end }
end

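Answer 1: (score: 1)

I would try this (my regex here assumes the first value is alphanumeric and the second is a number; adjust accordingly). Here scan returns all occurrences as an array, and the -1 tells it to grab only the last one, which seems to be exactly what you are asking for: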

>> foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
>> foo.scan(/\(\w+, ?\d+\)/)[-1]
=> "(ralph, 20)"
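
If the contents are wanted without the surrounding parentheses, and without assuming a word followed by a number, a capture group inside scan may be enough. This is only a sketch; the method name last_parens_scan is an illustrative assumption:

```ruby
# Hypothetical helper: scan with one capture group returns an array of
# one-element arrays, so flatten it and take the last entry.
def last_parens_scan(s)
  s.scan(/\(([^()]*)\)/).flatten.last
end

foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts last_parens_scan(foo)
```

When the string contains no parenthesized group, scan returns an empty array and the method returns nil.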

Answer 2: (score: 1)

A simple non-regex solution:

string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
string[string.rindex('(')..string.rindex(')')]

Example:

irb(main):001:0> string =  "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
irb(main):002:0> string[string.rindex('(')..string.rindex(')')]
=> "(ralph, 20)"

Without the parentheses:

irb(main):007:0> string[string.rindex('(')+1..string.rindex(')')-1]
=> "ralph, 20"
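
One caveat: rindex returns nil when the character is not found, so the expressions above raise an error on a string with no parentheses. A guarded sketch, assuming nil is an acceptable "not found" result (the method name last_parens_rindex is hypothetical):

```ruby
# Hypothetical helper: rindex-based extraction that returns nil
# instead of raising when either parenthesis is missing.
def last_parens_rindex(s)
  close_idx = s.rindex(')')
  return nil unless close_idx
  # Search backwards from the closing parenthesis for its opener.
  open_idx = s.rindex('(', close_idx)
  return nil unless open_idx
  s[(open_idx + 1)...close_idx]
end

string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts last_parens_rindex(string)
```

Passing close_idx as the second argument to rindex keeps a stray "(" after the last ")" from being picked up.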