Storing regex matches in Ruby?

Date: 2011-06-18 06:10:17

Tags: ruby regex match

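I'm using Ruby to parse a file so I can change its data format. I wrote a regex with three capture groups whose matches I want to store temporarily in variables. I can't store the matches because everything comes back nil.

Here is what I have so far.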

regex = '^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})'

begin
  file = File.new("testfile.csv", "r")
  while (line = file.gets)
    puts line
    match_array = line.scan(/regex/)
    puts $&
  end
  file.close
end

Here is some sample data I'm using for testing.

"https://mail.google.com","Master","password1","","https://mail.google.com","",""
"https://login.sf.org","monster@gmail.com","password2","https://login.sf.org","","ctl00$ctl00$ctl00$body$body$wacCenterStage$standardLogin$tbxUsername","ctl00$ctl00$ctl00$body$body$wacCenterStage$standardLogin$tbxPassword"
"http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"
"http://www.own3d.tv","Earth","passWOrd3","http://www.own3d.tv","","user_name","user_password"

Thanks,
LF4

1 Answer:

Answer 0: (score: 5)

This doesn't work:

match_array = line.scan(/regex/)

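That just uses the literal text "regex" as the pattern, not the string held in the regex variable. You can either put the big ugly regex directly into scan or create a Regexp instance: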

regex = Regexp.new('^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})')
# ...
match_array = line.scan(regex)
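
To make the difference concrete, here is a minimal sketch run against one of the sample rows from the question. The variable names (pattern, literal_hits, url, user, password) are only illustrative, and it assumes you want the three captures of the first match on each line in separate variables.

```ruby
# Pattern copied from the question. Single quotes keep the backslashes intact.
pattern = '^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})'
regex = Regexp.new(pattern)

# One of the sample rows from the question.
line = '"http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"'

# /regex/ is a literal pattern that looks for the five characters "regex",
# so it finds nothing in this line.
literal_hits = line.scan(/regex/)

# With the compiled Regexp, scan returns one array of captures per match.
match_array = line.scan(regex)

# match returns a MatchData object (or nil), which is convenient when you
# only need the first match and want the groups in separate variables.
url = user = password = nil
if (m = regex.match(line))
  url, user, password = m.captures
end

puts url, user, password
```

Because scan is given a pattern with groups, each element of match_array is itself an array of the three captures, so match_array.first would also give you the same values.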

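You should use the CSV library (one ships with Ruby: 1.8.7, 1.9) to parse the CSV file, and then apply regexes to the individual columns. You'll run into fewer quoting and escaping problems that way.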
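
A rough sketch of that approach, assuming the first three columns are the URL, user name, and password as in the sample data. The rows are inlined here so the snippet is self-contained, and url_pattern is a hypothetical per-column check, not something from the question.

```ruby
require 'csv'

# Two sample rows from the question, inlined so the sketch runs on its own.
data = <<~'CSV_DATA'
  "https://mail.google.com","Master","password1","","https://mail.google.com","",""
  "http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"
CSV_DATA

# Hypothetical per-column check: the first field should look like an http(s) URL.
url_pattern = %r{\Ahttps?://}

# CSV.parse handles the quoting, so each row arrives as an array of fields.
records = CSV.parse(data).map do |row|
  url, user, password = row.first(3)
  { url: url, user: user, password: password, valid_url: url_pattern.match?(url) }
end

records.each { |r| puts "#{r[:url]} #{r[:user]} #{r[:password]}" }
```

When reading from a file instead, CSV.foreach("testfile.csv") yields the same kind of row arrays one line at a time.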
