Excluding quotes in a split method regular expression

Time: 2015-05-12 02:05:36

Tags: ruby-on-rails ruby regex

I am having a hard time understanding how to use a regexp with the split method in Ruby.

I have the string:

"Mark Sally 'John Smith' Steve"

I am trying to get the array:

 ["Mark", "Sally", "John Smith", "Steve"]

3 answers:

Answer 0: (score: 0)

Instead of splitting the string, I would consider matching, or using a parser.

s = "Mark Sally 'John Smith' Steve"
p s.scan(/'([^']+)'|(\S+)/).flatten.compact
#=> ["Mark", "Sally", "John Smith", "Steve"]
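For the "use a parser" route, one option is Ruby's standard-library Shellwords module, which tokenizes a string using shell quoting rules. This is only a sketch: the answer does not name a parser, so the choice of Shellwords and the helper name split_words are assumptions for illustration. Shellwords also treats double quotes and backslashes as special, which may or may not suit your input.

```ruby
require 'shellwords'

# Hypothetical helper (not from the answer): tokenize a line using
# shell-style quoting rules, so a quoted phrase stays in one piece.
def split_words(line)
  Shellwords.split(line)
end

p split_words("Mark Sally 'John Smith' Steve")
#=> ["Mark", "Sally", "John Smith", "Steve"]
```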

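Answer 1: (score: 0)

This looks like a job for a CSV reader: it is designed to handle multiple words grouped together by quotes.

require 'csv'

line = "Mark Sally 'John Smith' Steve"
# parse_csv returns a new array; it does not modify line
fields = line.parse_csv(:col_sep => " ", :quote_char => "'")
p fields  #=> ["Mark", "Sally", "John Smith", "Steve"]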

Answer 2: (score: 0)

I suggest you use scan rather than split:

r = /
    (?<=') # match a single quote in a positive lookbehind
    .*?    # match any number of any characters, lazily
    (?=')  # match a single quote in a positive lookahead
    |   # or
    \w+ # match one or more word characters
    /x

"Mark Sally 'John Smith' Steve".scan(r)
   #=> ["Mark", "Sally", "John Smith", "Steve"] 

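Note that the order of the alternatives matters here. If \w+ comes first, it matches John before the lookbehind alternative is ever tried: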

r = /\w+|(?<=').*?(?=')/
"Mark Sally 'John Smith' Steve".scan(r)
   #=> ["Mark", "Sally", "John", "Smith", "Steve"]
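
One caveat worth checking before relying on the lookaround pattern: because the quotes are only asserted and never consumed, a closing quote can also act as the opening quote of the next match. The sketch below uses a made-up input with two quoted phrases (not from the question) and compares the lookaround pattern with a capture-group pattern that consumes the quotes. The constant and method names are illustrative only.

```ruby
# Lookaround pattern: the quotes are asserted, not consumed.
LOOKAROUND = /(?<=').*?(?=')|\w+/

# Capture-group pattern: the quotes are consumed, so they cannot be reused.
CAPTURING = /'([^']+)'|(\S+)/

def scan_lookaround(line)
  line.scan(LOOKAROUND)
end

def scan_capturing(line)
  # scan returns one [group1, group2] pair per match; drop the nils
  line.scan(CAPTURING).flatten.compact
end

line = "'a b' c 'd e'"
p scan_lookaround(line)  #=> ["a b", " c ", "d e"]
p scan_capturing(line)   #=> ["a b", "c", "d e"]
```

With a single quoted phrase, as in the question, both patterns give the same result; the difference only shows up once a second quoted phrase appears later in the string.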