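In my Ruby on Rails application, I am trying to build a parser to extract some metadata from a string.
Suppose the sample string is:
The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).
I want to extract the substring inside the last occurrence of ().
So no matter how many () there are in the string, I want to get "ralph, 20".
Is there a best way to write this Ruby string extraction, perhaps with a regexp?
Thanks,
John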
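Answer 0 (score: 2)
It looks like you want a sexeger. A sexeger works by reversing the string, running a reversed regex against the reversed string, and then reversing the result. Here is an example (forgive the code, I don't really know Ruby):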
#!/usr/bin/ruby
s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
reversed_s = s.reverse
# In the reversed string the last ")" comes first, so a non-greedy match reaches it immediately.
reversed_s =~ /^.*?\)(.*?)\(/
result = $1.reverse
puts result
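The same idea can be wrapped in a small method that returns nil instead of raising when the string contains no parentheses. This is only a sketch: the name `last_parenthesized_sexeger` is hypothetical, and it assumes the parentheses are balanced and not nested.

```ruby
# Hypothetical helper (the name is an assumption, not part of the answer).
# Returns the text inside the last "(...)" pair, or nil if there is none.
def last_parenthesized_sexeger(str)
  # Reversing puts the last ")" of the original string first, so the
  # non-greedy group stops at the "(" that opened that final pair.
  match = str.reverse.match(/\)(.*?)\(/)
  match && match[1].reverse
end

puts last_parenthesized_sexeger("The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).")
puts last_parenthesized_sexeger("no parentheses here").inspect
```

Using `match` rather than `=~` with `$1` avoids calling `reverse` on nil when nothing matches.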
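The fact that this is getting no votes tells me that nobody clicked through to read why you would want to use a sexeger, so here are the results of a benchmark: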
do they all return the same thing?
ralph, 20
ralph, 20
ralph, 20
ralph, 20
user system total real
scan greedy 0.760000 0.000000 0.760000 ( 0.772793)
scan non greedy 0.750000 0.010000 0.760000 ( 0.760855)
right index 0.760000 0.000000 0.760000 ( 0.770573)
sexeger non greedy 0.400000 0.000000 0.400000 ( 0.408110)
Here is the benchmark:
#!/usr/bin/ruby
require 'benchmark'
def scan_greedy(s)
  result = s.scan(/\([^)]*\)/x)[-1]
  result[1 .. result.length - 2]
end

def scan_non_greedy(s)
  result = s.scan(/\(.*?\)/)[-1]
  result[1 .. result.length - 2]
end

def right_index(s)
  s[s.rindex('(') + 1 .. s.rindex(')') - 1]
end

def sexeger_non_greedy(s)
  s.reverse =~ /^.*?\)(.*?)\(/
  $1.reverse
end

s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).";
puts "do they all return the same thing?",
scan_greedy(s), scan_non_greedy(s), right_index(s), sexeger_non_greedy(s)
n = 100_000
Benchmark.bm(18) do |x|
  x.report("scan greedy") { n.times do; scan_greedy(s); end }
  x.report("scan non greedy") { n.times do; scan_non_greedy(s); end }
  # Fixed: this line originally called scan_greedy(s), so the "right index"
  # timing listed in the results actually measured scan_greedy.
  x.report("right index") { n.times do; right_index(s); end }
  x.report("sexeger non greedy") { n.times do; sexeger_non_greedy(s); end }
end
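Answer 1 (score: 1)
I would try this (my regex here assumes the first value is alphanumeric and the second is a number; adjust accordingly). Here scan returns all occurrences as an array, and -1 tells it to grab only the last one, which seems to be exactly what you asked for: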
>> foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
>> foo.scan(/\(\w+, ?\d+\)/)[-1]
=> "(ralph, 20)"
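If the parentheses themselves are not wanted, one option is to add capture groups to the same pattern. The sketch below assumes every pair has the "(word, number)" shape described above; the helper name `last_name_and_age` is hypothetical.

```ruby
# Hypothetical helper; assumes every pair looks like "(word, number)".
# Returns [name, age] for the last pair, or nil if nothing matches.
def last_name_and_age(str)
  # With capture groups, scan returns one [name, age] array per match.
  name, age = str.scan(/\((\w+), ?(\d+)\)/).last
  name && [name, age.to_i]
end

foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
p last_name_and_age(foo)
```

This returns the two values separately, which may be more convenient than splitting "ralph, 20" afterwards.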
Answer 2 (score: 1)
A simple non-regex solution:
string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
string[string.rindex('(')..string.rindex(')')]
Example:
irb(main):001:0> string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
irb(main):002:0> string[string.rindex('(')..string.rindex(')')]
=> "(ralph, 20)"
Without the parentheses:
irb(main):007:0> string[string.rindex('(')+1..string.rindex(')')-1]
=> "ralph, 20"
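The one-liner raises NoMethodError when the string contains no parentheses, because rindex returns nil. A slightly more defensive sketch follows; the method name `last_parenthesized` is hypothetical, and it assumes the pairs are not nested.

```ruby
# Hypothetical helper (the name is an assumption).
# Returns the text inside the last "(...)" pair, or nil if there is none.
def last_parenthesized(str)
  close = str.rindex(')')
  return nil unless close
  # Look for "(" only to the left of the last ")".
  open = str.rindex('(', close)
  return nil unless open
  str[(open + 1)...close]
end

puts last_parenthesized("The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).")
puts last_parenthesized("no parentheses here").inspect
```

Passing the position of the last ")" as the second argument to rindex also handles a stray "(" that appears after the final closing parenthesis.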