Hi everyone, I'm using Ruby 1.8.7 and I need a regular expression to split a string into words. Here are some examples:
SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble
ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble
AF 0.3mM IPTG induce@10C 24hrs OD=0.2@5.6h, 19mg/3L, 50%soluble
AF, 0.3mM IPTG @ 37C, IND @ OD 0.55 @ 4hrs, 476mg/12L, 0% soluble
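Here is the split I want (for the first example):
SSAI (everything before the first @)
37 (integer after the first @)
2.3 (float before the second @)
16.6 (float after the second @)
492 and 3 (pattern mg/*L)
0 (before the %)
I have a set of strings that follow the same pattern, and I want to run a regex over them and import the results into a database.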
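Answer 0 (score: 1)
This seems to work for all of your inputs. The idea is to do it in two steps: first split the string on /@/ and set the first field aside for later, then scan the remaining fields for decimal numbers.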
first, *rest = str.split(/@/)
rest.map!{|s| s.scan(/\d+\.?\d*/)}.flatten!
first
#=> "SSAI "
rest
#=> ["37", "2.3", "16.6", "492", "3", "0"]
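Full example:
def extract(source)
  # Split on "@"; the first field is the label, the rest hold the numbers.
  first, *rest = source.split(/@/)
  # Pull every integer or decimal out of the remaining fields.
  rest.map!{|s| s.scan(/\d+\.?\d*/)}.flatten!
  [first, *rest]
end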
input = "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble
ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble
AF 0.3mM IPTG induce@10C 24hrs OD=0.2@5.6h, 19mg/3L, 50%soluble
AF, 0.3mM IPTG @ 37C, IND @ OD 0.55 @ 4hrs, 476mg/12L, 0% soluble"
input.lines.each do |line|
  p extract(line)
end
# ["SSAI ", "37", "2.3", "16.6", "492", "3", "0"]
# ["ss autoinduction ", "37", "2.1", "16.6", "487", "3", "70"]
# ["AF 0.3mM IPTG induce", "10", "24", "0.2", "5.6", "19", "3", "50"]
# ["AF, 0.3mM IPTG ", "37", "0.55", "4", "476", "12", "0"]
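Note that the values above are all strings and the label keeps its trailing space. If you want numbers before importing into the database, a minimal sketch might look like the following. The method name extract_typed is hypothetical and not part of the answer above, and it assumes that any value containing a "." should become a Float and everything else an Integer.

```ruby
# Hypothetical helper: same two-step split/scan idea as extract,
# but strips the label and converts the numeric strings.
def extract_typed(source)
  first, *rest = source.split(/@/)
  numbers = rest.map { |s| s.scan(/\d+\.?\d*/) }.flatten
  # Assumption: a "." in the text means Float, otherwise Integer.
  typed = numbers.map { |n| n.include?(".") ? n.to_f : n.to_i }
  [first.to_s.strip] + typed
end

p extract_typed("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
# ["SSAI", 37, 2.3, 16.6, 492, 3, 0]
```

The number of values still varies from line to line (the third sample yields an extra "24"), so mapping positions to database columns would need a check on the array length first.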
Answer 1 (score: 0)
http://ruby-doc.org/core-2.1.0/String.html#method-i-scan
> s = "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble"
=> "SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble"
> s.scan(/(.*?) @ (\d+)C .*?OD=([\d.]+) @ ([\d.]+)h, (\d+)mg.*?, (\d+)%/)
=> [["SSAI", "37", "2.3", "16.6", "492", "0"]]
Alternatively, match the string against that regex and then read the groups from $1, $2, and so on.
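The $1, $2 variant could be sketched roughly like this. The names SAMPLE_RE and parse_sample and the hash keys are hypothetical, chosen only for illustration; the regex is the one from this answer, so it only matches lines shaped like the first sample (spaces around each @, and "h," after the hours).

```ruby
# Regex copied from the answer above.
SAMPLE_RE = /(.*?) @ (\d+)C .*?OD=([\d.]+) @ ([\d.]+)h, (\d+)mg.*?, (\d+)%/

# parse_sample and the hash keys are hypothetical names.
def parse_sample(line)
  return nil unless line =~ SAMPLE_RE
  # After a successful match, $1..$6 hold the six capture groups.
  { :name => $1, :temp_c => $2, :od => $3,
    :hours => $4, :mg => $5, :soluble_pct => $6 }
end

p parse_sample("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
```

Lines that do not fit the pattern, such as the second sample with "@37C", return nil here, so the caller has to decide what to do with them.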
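Answer 2 (score: 0)
This produces one capture group for each value you want. However, it cannot put both 492 and 3 into a single fifth group, so they become the fifth and sixth groups, which pushes the % soluble value to the seventh group: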
/^(.*?) @ (\d+)\D+(\d+\.\d+) @ (\d+\.\d+)\D+(\d+)mg\/(\d+)L\D*(\d+)%/
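As a quick check of the group numbering, here is a small sketch that applies this regex to two of the sample lines. The names GROUPED_RE and grouped_captures are hypothetical. Because the pattern contains a literal " @ " with spaces, it matches the first sample but not the second, where the @ signs are not surrounded by spaces.

```ruby
# Regex copied from the answer above.
GROUPED_RE = /^(.*?) @ (\d+)\D+(\d+\.\d+) @ (\d+\.\d+)\D+(\d+)mg\/(\d+)L\D*(\d+)%/

# Hypothetical helper: returns the capture groups as an array, or nil if no match.
def grouped_captures(line)
  m = GROUPED_RE.match(line)
  m && m.captures
end

p grouped_captures("SSAI @ 37C final OD=2.3 @ 16.6h, 492mg/3L, 0% soluble")
# ["SSAI", "37", "2.3", "16.6", "492", "3", "0"]
p grouped_captures("ss autoinduction @37C overnight, OD=2.1@16.6hrs, 487mg/3L, 70%soluble")
# nil
```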
Answer 3 (score: 0)
This is what I came up with. If anyone has suggestions, please fire away. Thanks.
/(.*?)@{1}(\d+).*(\d+\.\d+)@(\d+\.\d+).*?(\d+)mg\/(\d+).*?(\d+)%/