By default, the #split method works like this:
"id,name,title(first_name,last_name)".split(",")
which gives the following output:
["id", "name", "title(first_name", "last_name)"]
But I want the following:
["id", "name", "title(first_name,last_name)"]
So I split with the following regular expression (taken from this answer) to get the desired output:
"id,name,title(first_name,last_name)".split(/,(?![^(]*\))/)
However, the logic fails on a different string, which is my actual input. My actual string is:
"id,name,title(first_name,last_name,address(street,pincode(id,code)))"
and the same split gives the following output:
["id", "name", "title(first_name", "last_name", "address(street", "pincode(id,code)))"]
instead of
["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
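Answer 0 (score: 3)
Updated answer
My previous answer did not handle all the cases that were correctly pointed out in the comments, so I am updating it with a different solution.
This approach replaces every valid comma (one that is not inside parentheses) with the delimiter |, then splits the string on that delimiter with String#split. It assumes that | never appears in the input.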
class TokenArrayParser
  # Placeholder that marks the commas to split on; must not occur in the input.
  SPLIT_CHAR = '|'.freeze

  def initialize(str)
    @str = str
  end

  def parse
    separate_on_valid_comma.split(SPLIT_CHAR)
  end

  private

  # Returns a copy of the string in which every comma outside
  # parentheses has been replaced with SPLIT_CHAR.
  def separate_on_valid_comma
    dup = @str.dup
    paren_count = 0
    dup.length.times do |idx|
      case dup[idx]
      when '(' then paren_count += 1
      when ')' then paren_count -= 1
      when ',' then dup[idx] = SPLIT_CHAR if paren_count.zero?
      end
    end
    dup
  end
end
%w(
id,name,title(first_name,last_name)
id,name,title(first_name,last_name,address(street,pincode(id,code)))
first_name,last_name,address(street,pincode(id,code)),city(name)
a,b(c(d),e,f)
id,name,title(first_name,last_name),pub(name,address)
).each {|str| puts TokenArrayParser.new(str).parse.inspect }
# =>
# ["id", "name", "title(first_name,last_name)"]
# ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
# ["first_name", "last_name", "address(street,pincode(id,code))", "city(name)"]
# ["a", "b(c(d),e,f)"]
# ["id", "name", "title(first_name,last_name)", "pub(name,address)"]
I'm sure this can be optimized further.
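One possible refinement, offered only as a sketch: collect the tokens in a single pass instead of rewriting commas to a placeholder, so the input may contain | or any other character. The method name split_top_level is hypothetical and not part of the answer above. Note that, unlike String#split, this sketch keeps empty trailing fields and returns [""] for an empty string.

```ruby
# Hypothetical helper (not from the answer above): splits on commas that
# are not enclosed in parentheses, building the tokens directly.
def split_top_level(str)
  tokens = []
  current = +''   # unfrozen empty string that accumulates the current token
  depth = 0       # number of currently open parentheses

  str.each_char do |c|
    case c
    when '(' then depth += 1
    when ')' then depth -= 1
    end

    if c == ',' && depth.zero?
      # A top-level comma ends the current token.
      tokens << current
      current = +''
    else
      current << c
    end
  end

  tokens << current
end

puts split_top_level("a|b,c(d,e)").inspect
```

Because no placeholder character is involved, the result does not depend on which characters happen to be absent from the input.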
Answer 1 (score: 3)
def doit(str)
  split_here = 0.chr   # a character assumed not to occur in the input
  stack = 0            # number of currently open parentheses
  s = str.gsub(/./) do |c|
    ret = c
    case c
    when '('
      stack += 1
    when ','
      # Only a comma outside all parentheses becomes a split point.
      ret = split_here if stack.zero?
    when ')'
      raise(RuntimeError, "parens are unbalanced") if stack.zero?
      stack -= 1
    end
    ret
  end
  raise(RuntimeError, "parens are unbalanced, stack at end=#{stack}") if stack > 0
  s.split(split_here)
end
doit "id,name,title(first_name,last_name)"
#=> ["id", "name", "title(first_name,last_name)"]
doit "id,name,title(first_name,last_name,address(street,pincode(id,code)))"
#=> ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
doit "a,b(c(d),e,f)"
#=> ["a", "b(c(d),e,f)"]
doit "id,name,title(first_name,last_name),pub(name,address)"
#=> ["id", "name", "title(first_name,last_name)", "pub(name,address)"]
doit "a,b(c)d),e,f)"
#=> RuntimeError: parens are unbalanced
doit "a,b(c(d),e),f("
#=> RuntimeError: parens are unbalanced, stack at end=1
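A comma is split on if and only if stack is zero when the comma is encountered. Each such comma is changed to a character that does not occur in the string (split_here; I used 0.chr). The resulting string is then split on split_here.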
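To make the rule concrete, here is a small sketch that pairs each comma with the nesting depth at which it occurs; only the commas at depth 0 are the ones the method above splits on. The helper name comma_depths is hypothetical and exists purely for illustration.

```ruby
# Hypothetical helper: returns [index, depth] for every comma in the string,
# where depth is the number of parentheses open at that position.
def comma_depths(str)
  depth = 0
  result = []
  str.each_char.with_index do |c, idx|
    case c
    when '(' then depth += 1
    when ')' then depth -= 1
    when ',' then result << [idx, depth]
    end
  end
  result
end

puts comma_depths("a,b(c(d),e,f)").inspect
#=> [[1, 0], [8, 1], [10, 1]]
```

Only the comma at index 1 sits at depth 0, which matches the result ["a", "b(c(d),e,f)"] shown earlier.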
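Answer 2 (score: -1)
This may be one way to do it:
"id,name,title(first_name,last_name)".split(",")[0..1] << "id,name,title(first_name,last_name)".split(",")[-2..-1].join(",")
It splits the same string twice, then combines the first two elements of the first split with the last two elements of the second split, joined back together with a comma. It gives the desired result for this specific input only; it does not handle deeper nesting or a different number of fields.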
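For inputs beyond that one specific case, a regex-based alternative is possible. The sketch below is a different technique from the one-liner above: it relies on Ruby's recursive subexpression call (\g<name>) to match balanced parentheses and collects whole fields with String#scan rather than splitting. The method name split_balanced and the group name group are hypothetical. It assumes the parentheses are balanced, and, unlike split, it silently skips empty fields.

```ruby
# A field is a run of ordinary characters and/or balanced parenthesised
# groups; \g<group> re-enters the named group to handle nesting.
FIELD = /(?:[^,()]|(?<group>\((?:[^()]|\g<group>)*\)))+/

# Hypothetical helper: returns the top-level comma-separated fields.
def split_balanced(str)
  fields = []
  # With a capture group present, scan yields the captures, so the block
  # reads the whole match from $& instead.
  str.scan(FIELD) { fields << $& }
  fields
end

puts split_balanced("id,name,title(first_name,last_name,address(street,pincode(id,code)))").inspect
```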