Using the String#split method

Date: 2018-01-01 13:10:31

Tags: ruby-on-rails ruby string split

By default, the #split method works like this:

"id,name,title(first_name,last_name)".split(",")

which gives the following output:

["id", "name", "title(first_name", "last_name)"]

But I want the following instead:

["id", "name", "title(first_name,last_name)"]

So I passed the following regular expression (from this answer) to split to get the desired output:

"id,name,title(first_name,last_name)".split(/,(?![^(]*\))/)

However, the logic fails on a different string, which is my actual input. My actual string is:

"id,name,title(first_name,last_name,address(street,pincode(id,code)))"

and it produces the following output:

["id", "name", "title(first_name", "last_name", "address(street", "pincode(id,code)))"]

instead of

["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
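The failure can be traced to the lookahead: `(?![^(]*\))` only asks whether a `)` appears before the next `(`, so it protects commas in the innermost group and nothing else. The following is a small diagnostic sketch, not part of the original question; the helper name `split_depths` is hypothetical. It reports the parenthesis depth at every comma the pattern splits on, where any non-zero depth marks an unwanted split.

```ruby
# The single-level pattern from the question.
PATTERN = /,(?![^(]*\))/

# Returns the parenthesis nesting depth at each comma the pattern splits on.
# Depth 0 is a top-level comma; anything higher is a split inside parentheses.
def split_depths(input, pattern = PATTERN)
  depths = []
  input.scan(pattern) do
    prefix = Regexp.last_match.pre_match
    depths << (prefix.count("(") - prefix.count(")"))
  end
  depths
end

p split_depths("id,name,title(first_name,last_name)")
# prints [0, 0]
p split_depths("id,name,title(first_name,last_name,address(street,pincode(id,code)))")
# prints [0, 0, 1, 1, 2]
```

With one level of nesting, every split happens at depth 0. With deeper nesting, the commas followed by another `(` are split even though they sit at depth 1 or 2.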

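3 answers:

Answer 0: (score: 3)

Updated answer

My previous answer did not handle all the cases that were correctly pointed out in the comments, so I am updating it with a different solution.

This approach replaces every valid (top-level) comma with the separator | and then splits the string on that separator with String#split. It assumes the input itself never contains |.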

class TokenArrayParser
  SPLIT_CHAR = '|'.freeze

  def initialize(str)
    @str = str
  end

  def parse
    separate_on_valid_comma.split(SPLIT_CHAR)
  end

  private

  def separate_on_valid_comma
    dup = @str.dup
    paren_count = 0
    dup.length.times do |idx|
      case dup[idx]
      when '(' then  paren_count += 1
      when ')' then paren_count -= 1
      when ',' then dup[idx] = SPLIT_CHAR if paren_count.zero?
      end
    end

    dup
  end
end

%w(
  id,name,title(first_name,last_name)
  id,name,title(first_name,last_name,address(street,pincode(id,code)))
  first_name,last_name,address(street,pincode(id,code)),city(name)
  a,b(c(d),e,f)
  id,name,title(first_name,last_name),pub(name,address)
).each {|str| puts TokenArrayParser.new(str).parse.inspect }

# =>
# ["id", "name", "title(first_name,last_name)"]
# ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
# ["first_name", "last_name", "address(street,pincode(id,code))", "city(name)"]
# ["a", "b(c(d),e,f)"]
# ["id", "name", "title(first_name,last_name)", "pub(name,address)"]

I am sure this can be optimized further.
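One possible direction, sketched here under my own assumptions rather than taken from the answer, is to build the tokens in the same pass that counts parentheses. That removes the need for a separator character that must not occur in the input. The method name `split_top_level` and its `delimiter` parameter are hypothetical.

```ruby
# Hypothetical helper, not part of TokenArrayParser above.
# Splits str on delimiter, but only where the parenthesis depth is zero.
def split_top_level(str, delimiter = ",")
  depth = 0
  tokens = ["".dup] # start with one empty, mutable token

  str.each_char do |ch|
    case ch
    when "(" then depth += 1
    when ")" then depth -= 1
    end

    if ch == delimiter && depth.zero?
      tokens << "".dup # a top-level delimiter starts a new token
    else
      tokens.last << ch # everything else belongs to the current token
    end
  end

  tokens
end

p split_top_level("id,name,title(first_name,last_name,address(street,pincode(id,code)))")
# prints ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
```

Unlike String#split, this sketch keeps trailing empty fields ("a," yields ["a", ""]) and returns [""] for an empty string, and it does not check that the parentheses are balanced.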

Answer 1: (score: 3)

def doit(str)
  split_here = 0.chr
  stack = 0
  s = str.gsub(/./) do |c|
    ret = c
    case c
    when '('
      stack += 1
    when ','
      ret = split_here if stack.zero?
    when ')'
      raise(RuntimeError, "parens are unbalanced") if stack.zero?
      stack -= 1
    end
    ret
  end
  raise(RuntimeError, "parens are unbalanced, stack at end=#{stack}") if stack > 0
  s.split(split_here)
end

doit "id,name,title(first_name,last_name)"
  #=> ["id", "name", "title(first_name,last_name)"]
doit "id,name,title(first_name,last_name,address(street,pincode(id,code)))"
  #=> ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
doit "a,b(c(d),e,f)"
  #=> ["a", "b(c(d),e,f)"]
doit "id,name,title(first_name,last_name),pub(name,address)"
  #=> ["id", "name", "title(first_name,last_name)", "pub(name,address)"]
doit "a,b(c)d),e,f)"
  #=> RuntimeError: parens are unbalanced
doit "a,b(c(d),e),f("
  #=> RuntimeError: parens are unbalanced, stack at end=1

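A comma is a split point if and only if stack is zero when it is encountered. Each such comma is replaced with a character that does not occur in the string (split_here; I used 0.chr), and the resulting string is then split on split_here.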
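For comparison, a regex-only counterpart to the counting approach is possible because Ruby's regex engine supports subexpression calls (`\g<name>`), which can match balanced parentheses recursively. This is a sketch that is not part of the answer above; the names `FIELD` and `fields` are hypothetical. Unlike doit, it does not raise on unbalanced parentheses, and it skips empty fields.

```ruby
# Hypothetical names; a sketch, not the method used in the answer above.
# One field is a run of plain characters and balanced parenthesised groups.
FIELD = /
  (?:
    [^,()]                                      # any plain character
    |
    (?<group> \( (?: [^()] | \g<group> )* \) )  # balanced group, matched recursively
  )+
/x

def fields(str)
  # String#gsub without a block or replacement returns an enumerator
  # that yields each matched substring.
  str.gsub(FIELD).to_a
end

p fields("id,name,title(first_name,last_name,address(street,pincode(id,code)))")
# prints ["id", "name", "title(first_name,last_name,address(street,pincode(id,code)))"]
```

The pattern matches the fields themselves rather than the separators, which is why it is used with a scan-style enumeration instead of String#split.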

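Answer 2: (score: -1)

This might be one way to do it:

"id,name,title(first_name,last_name)".split(",")[0..1] << "id,name,title(first_name,last_name)".split(",")[-2..-1].join(",")

It splits two copies of the string on commas, then combines the first two elements of the first split with the last two elements of the second split, rejoined with a comma. At least in this specific case, it gives you the result you want.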