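Regular expression for a set of strings

Time: 2014-02-24 19:36:17

Tags: ruby-on-rails ruby regex string grep

Hi everyone, I'm using Ruby 1.8.7 and I need a regular expression to split a string into words. Here are some examples: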

SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble
ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble
AF 0.3mM IPTG induce@10C 24hrs OD=0.2@5.6h, 19mg/3L, 50%soluble
AF, 0.3mM IPTG @ 37C, IND @ OD 0.55 @ 4hrs, 476mg/12L, 0% soluble

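Here is the split I want (with respect to the first example):

  • SSAI (everything before the first @)
  • 37 (integer after the first @)
  • 2.3 (float before the second @)
  • 16.6 (float after the second @)
  • 492 and 3 (pattern mg/*L)
  • 0 (before the %)

I have a set of strings that follow the same pattern, and I want to run a regex over them and import the results into a database.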

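4 answers:

Answer 0 (score: 1)

This seems to work for all of the inputs. The idea is to do the work in two steps: first split the string on /@/ and save the first field for later, then scan the remaining fields for decimal numbers.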

first, *rest = str.split(/@/)
rest.map!{|s| s.scan(/\d+\.?\d*/)}.flatten!

first
#=> "SSAI "

rest
#=> ["37", "2.3", "16.6", "492", "3", "0"]

Full example:

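def extract(source)
  # Split on "@"; the first piece is the label, the rest hold the numbers.
  first, *rest = source.split(/@/)
  # Collect every integer or decimal from the remaining pieces.
  rest.map!{|s| s.scan(/\d+\.?\d*/)}.flatten!
  [first, *rest]
end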

input = "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble
ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble
AF 0.3mM IPTG induce@10C 24hrs OD=0.2@5.6h, 19mg/3L, 50%soluble
AF, 0.3mM IPTG @ 37C, IND @ OD 0.55 @ 4hrs, 476mg/12L, 0% soluble"

input.lines.each do |line|
  p extract(line)
end

# ["SSAI ", "37", "2.3", "16.6", "492", "3", "0"]
# ["ss autoinduction ", "37", "2.1", "16.6", "487", "3", "70"]
# ["AF 0.3mM IPTG induce", "10", "24", "0.2", "5.6", "19", "3", "50"]
# ["AF, 0.3mM IPTG ", "37", "0.55", "4", "476", "12", "0"]
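Since the goal is a database import, the strings returned above would probably need converting to numbers first. The following is only a sketch of that step, not part of the original answer: extract_typed is a hypothetical helper name, and it assumes that a value containing a dot should become a Float and anything else an Integer.

```ruby
# Hypothetical helper (not from the answer): same split-then-scan idea,
# but returns a trimmed label followed by Integer/Float values.
def extract_typed(line)
  first, *rest = line.split(/@/)
  numbers = rest.map { |s| s.scan(/\d+\.?\d*/) }.flatten
  # Assumption: a "." in the token means Float, otherwise Integer.
  [first.strip] + numbers.map { |n| n.include?(".") ? n.to_f : n.to_i }
end

p extract_typed("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
# ["SSAI", 37, 2.3, 16.6, 492, 3, 0]
```

Note that the number of fields is not fixed: the third sample yields an extra "24" (from "24hrs"), so the positions would need checking before mapping them to columns.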

Answer 1 (score: 0)

http://ruby-doc.org/core-2.1.0/String.html#method-i-scan

 > s = "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble"
 => "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble"

 > s.scan(/(.*?) @ (\d+)C .*?OD=([\d.]+) @ ([\d.]+)h, (\d+)mg.*?, (\d+)%/)
 => [["SSAI", "37", "2.3", "16.6", "492", "0"]]

Alternatively, match the string against that regex; $1, $2, etc. are then set to the captured groups.
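A minimal sketch of the $1, $2 variant, using the same regex. The method name parse_sample and the hash keys are hypothetical, chosen only for illustration. The pattern expects the spacing of the first sample (" @ "), so a line written as "@37C" returns nil here.

```ruby
PATTERN = /(.*?) @ (\d+)C .*?OD=([\d.]+) @ ([\d.]+)h, (\d+)mg.*?, (\d+)%/

# Hypothetical helper: returns a hash of the captures, or nil if no match.
def parse_sample(line)
  return nil unless line =~ PATTERN
  # After a successful =~, $1..$6 hold the six captured groups.
  {
    :name    => $1,
    :temp_c  => $2.to_i,
    :od      => $3.to_f,
    :hours   => $4.to_f,
    :mg      => $5.to_i,
    :soluble => $6.to_i
  }
end

p parse_sample("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
```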

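Answer 2 (score: 0)

This produces a subgroup for each desired value. However, it cannot put both 492 and 3 into a single fifth subgroup, so they become the fifth and sixth subgroups, which pushes the % soluble value to the seventh submatch: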

/^(.*?) @ (\d+)\D+(\d+\.\d+) @ (\d+\.\d+)\D+(\d+)mg\/(\d+)L\D*(\d+)%/
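To make the group numbering concrete, here is a short sketch that applies this pattern to the first sample and lists the seven captures. The constant name ANSWER2 is only for illustration. Because the pattern spells out " @ " with spaces and requires a decimal point on both sides of the second @, it returns nil for samples such as the second one ("@37C").

```ruby
ANSWER2 = /^(.*?) @ (\d+)\D+(\d+\.\d+) @ (\d+\.\d+)\D+(\d+)mg\/(\d+)L\D*(\d+)%/

m = ANSWER2.match("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
# Groups 5 and 6 are the mg and L values; group 7 is the % soluble.
p m.captures
# ["SSAI", "37", "2.3", "16.6", "492", "3", "0"]

p ANSWER2.match("ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble")
# nil
```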

Answer 3 (score: 0)

This is what I could come up with. If anyone has any suggestions, please shoot. Thanks.

 /(.*?)@{1}(\d+).*(\d+\.\d+)@(\d+\.\d+).*?(\d+)mg\/(\d+).*?(\d+)%/
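One observation on this pattern: it requires a digit immediately after each @, so it matches the compact samples ("@37C") but not the first one ("@ 37C"). The sketch below shows that, followed by one possible whitespace-tolerant variant. TOLERANT is an assumption of mine rather than anything posted in the thread, and it has only been checked against the four sample lines in the question.

```ruby
ORIGINAL = /(.*?)@{1}(\d+).*(\d+\.\d+)@(\d+\.\d+).*?(\d+)mg\/(\d+).*?(\d+)%/
# Assumed variant: optional whitespace around "@", lazy gaps, and
# [\d.]+ so that values without a decimal point (e.g. "4hrs") still match.
TOLERANT = /(.*?)\s*@\s*(\d+).*?([\d.]+)\s*@\s*([\d.]+).*?(\d+)mg\/(\d+)L.*?(\d+)%/

spaced  = "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble"
compact = "ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble"

p ORIGINAL.match(spaced)            # nil: a space follows each "@"
p ORIGINAL.match(compact).captures
# ["ss autoinduction ", "37", "2.1", "16.6", "487", "3", "70"]
p TOLERANT.match(spaced).captures
# ["SSAI", "37", "2.3", "16.6", "492", "3", "0"]
```

The greedy .* before the third group in the original can also swallow leading digits of a value such as "12.5", which is why the variant uses a lazy .*? there.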