I need to split a string into a list of parts in Ruby, but I need to ignore anything inside parentheses. For example:
A +4, B +6, C (hello, goodbye) +5, D +3
I would like the resulting list to be:
[0]A +4
[1]B +6
[2]C (hello, goodbye) +5
[3]D +3
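But I can't simply split on commas, because that would also split the content inside the parentheses. Is there a way to do the split without first parsing out the commas inside the parentheses?
Thanks.
Answer 0 (score: 13)
Try this: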
s = 'A +4, B +6, C (hello, goodbye) +5, D +3'
tokens = s.scan(/(?:\(.*?\)|[^,])+/)
tokens.each {|t| puts t.strip}
Output:
A +4
B +6
C (hello, goodbye) +5
D +3
A short explanation:
(?: # open non-capturing group 1
\( # match '('
.*? # reluctantly match zero or more characters other than line breaks
\) # match ')'
| # OR
[^,] # match something other than a comma
)+ # close non-capturing group 1 and repeat it one or more times
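The tokens returned by scan still carry their leading spaces, so a caller who wants the list itself rather than printed lines can strip each token. A minimal sketch, assuming input shaped like the question's example; the method name split_with_scan is hypothetical:

```ruby
# Hypothetical helper wrapping the scan pattern explained above.
def split_with_scan(str)
  # A parenthesized group is consumed whole by the first alternative,
  # so the commas inside it never end a token.
  str.scan(/(?:\(.*?\)|[^,])+/).map(&:strip)
end

p split_with_scan('A +4, B +6, C (hello, goodbye) +5, D +3')
```

Because `\(.*?\)` stops at the first closing parenthesis, this sketch assumes parentheses are not nested.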
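Another option is to split on a comma followed by optional whitespace, but only when the first parenthesis visible when looking ahead is an opening parenthesis (or there is no parenthesis at all, i.e. the end of the string is reached):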
s = 'A +4, B +6, C (hello, goodbye) +5, D +3'
tokens = s.split(/,\s*(?=[^()]*(?:\(|$))/)
tokens.each {|t| puts t}
This produces the same output, but I find the scan approach cleaner.
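Both patterns look only for the nearest parenthesis, so they assume parentheses are never nested. If nesting can occur, one alternative (not part of this answer) is to walk the string and track the parenthesis depth by hand. A minimal sketch; the name split_outside_parens is hypothetical:

```ruby
# Hypothetical helper: split on commas that sit outside any parentheses.
def split_outside_parens(str)
  parts = []
  current = ''.dup   # mutable buffer for the part being built
  depth = 0          # number of currently open parentheses

  str.each_char do |ch|
    if ch == ',' && depth.zero?
      # A top-level comma ends the current part.
      parts << current.strip
      current = ''.dup
      next
    end

    depth += 1 if ch == '('
    depth -= 1 if ch == ')' && depth > 0   # ignore a stray ')'
    current << ch
  end

  parts << current.strip
end

p split_outside_parens('A +4, B +6, C (hello, goodbye) +5, D +3')
p split_outside_parens('X (a, (b), c), Y')
```

This trades the one-line regex for a loop, but the depth counter makes the nesting behaviour explicit.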
Answer 1 (score: 5)
string = "A +4, B +6, C (hello, goodbye) +5, D +3"
string.split(/ *, *(?=[^\)]*?(?:\(|$))/)
# => ["A +4", "B +6", "C (hello, goodbye) +5", "D +3"]
How this regex works:
/
*, * # find comma, ignoring leading and trailing spaces.
(?= # (Pattern in here is matched against but is not returned as part of the match.)
[^\)]*? # lazily match zero or more characters that are not ')'
(?: # <non-capturing parentheses group>
\( # left paren '('
| # - OR -
$ # (end of string)
)
)
/
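One detail worth checking in the breakdown above: in Ruby, `$` matches at the end of every line, not only at the end of the string. The sketch below compares the pattern as given with an assumed variant that uses `\z` instead; the constant names and the multi-line input are made up for illustration:

```ruby
# The pattern from this answer, unchanged.
SPLIT_ON_DOLLAR = / *, *(?=[^\)]*?(?:\(|$))/
# Assumed variant: \z anchors at the end of the whole string only.
SPLIT_ON_END = / *, *(?=[^\)]*?(?:\(|\z))/

single = "A +4, B +6, C (hello, goodbye) +5, D +3"
multi  = "C (hello,\ngoodbye) +5, D +3"   # line break inside the parentheses

p single.split(SPLIT_ON_DOLLAR)  # the four parts from the question
p multi.split(SPLIT_ON_DOLLAR)   # also splits at the comma before the line break
p multi.split(SPLIT_ON_END)      # keeps the parenthesized group together
```

For single-line input the two patterns behave the same, so the variant only matters if the string can contain line breaks.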