How do I split a string containing a set in Ruby?

Time: 2018-11-16 15:13:39

Tags: ruby regex split

I'm new to the forum. I'm currently trying to take the following string:

65101km,Sedan,Manual,18131A,FWD,Used,5.5L/100km,Toyota,camry,SE,{AC,Heated Seats, Heated Mirrors, Keyless Entry},2010

and split it to get this:

65101km
Sedan
Manual
18131A
FWD
Used
5.5L/100km
Toyota
camry
SE
{AC, Heated Seats, Heated Mirrors, Keyless Entry}
2010

I have the following regex:

data_from_file.split(/[{},]+/)

But I'm having trouble keeping the braced set together as a single item.
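To make the problem concrete, here is a minimal sketch of what that split returns. It assumes data_from_file holds just the single sample line shown earlier. Because { and } are part of the delimiter class, the braces are consumed and the set is broken into separate items:

```ruby
# Assumption: data_from_file contains only the sample line from the question.
data_from_file = "65101km,Sedan,Manual,18131A,FWD,Used,5.5L/100km,Toyota,camry,SE,{AC,Heated Seats, Heated Mirrors, Keyless Entry},2010"

parts = data_from_file.split(/[{},]+/)

p parts.length   # 15 items instead of the 12 fields wanted
p parts.last(5)  # ["AC", "Heated Seats", " Heated Mirrors", " Keyless Entry", "2010"]
```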

Any ideas?

2 Answers:

Answer 0: (score: 1)

str = "65101km,Sedan,Manual,18131A,FWD,Used,5.5L/100km,Toyota,camry,SE,{AC,Heated Seats, Heated Mirrors, Keyless Entry},2010"

r = /
    (?<=\A|,)  # match the beginning of the string or a comma in a positive lookbehind
    (?:        # begin a non-capture group
      {.*?}    # match an open brace followed by any number of characters,
               # lazily, followed by a closed brace
      |        # or
      .*?      # match any number of characters, lazily 
    )          # close non-capture group
    (?=,|\z)   # match a comma or the end of the string in a positive lookahead
    /x         # free-spacing regex definition mode

str.scan r
  #=> ["65101km", "Sedan", "Manual", "18131A", "FWD", "Used", "5.5L/100km", "Toyota",
  #    "camry", "SE", "{AC,Heated Seats, Heated Mirrors, Keyless Entry}", "2010"]
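For reference, the same pattern can be written on one line without free-spacing mode. The sketch below is only a quick check that the result has the 12 fields the question asks for; the constant name FIELD_PATTERN is a hypothetical name chosen for illustration.

```ruby
# Compact, single-line form of the free-spacing pattern above.
# FIELD_PATTERN is a hypothetical name, not part of the original answer.
FIELD_PATTERN = /(?<=\A|,)(?:{.*?}|.*?)(?=,|\z)/

str = "65101km,Sedan,Manual,18131A,FWD,Used,5.5L/100km,Toyota,camry,SE,{AC,Heated Seats, Heated Mirrors, Keyless Entry},2010"
fields = str.scan(FIELD_PATTERN)

p fields.length  # 12
p fields[10]     # "{AC,Heated Seats, Heated Mirrors, Keyless Entry}"
```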

Two notes follow. I'll illustrate them with a simpler string.

str = "65101km,Sedan,{AC,Heated Seats},2010"

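{.*?} must come before .*? in (?:{.*?}|.*?)

Alternation tries its branches from left to right, so if .*? comes first it stops at the first comma inside the braces. If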

r = /(?<=\A|,)(?:.*?|{.*?})(?=,|\z)/

then

str.scan r
  #=> ["65101km", "Sedan", "{AC", "Heated Seats}", "2010"]

Both .* matches must be lazy (a.k.a. non-greedy)

If

r = /(?<=\A|,)(?:{.*?}|.*)(?=,|\z)/

then

str.scan r
  #=> ["65101km,Sedan,{AC,Heated Seats},2010"]

If

r = /(?<=\A|,)(?:{.*}|.*?)(?=,|\z)/

then

"65101km,Sedan,{AC,Heated Seats},2010,{starter motor, pneumatic tires}".scan r
  #=> ["65101km", "Sedan", "{AC,Heated Seats},2010,{starter motor, pneumatic tires}"]
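For comparison, here is a small sketch that runs the original pattern (alternatives in the right order, both quantifiers lazy) on that same two-set string. The variable names two_sets and r_lazy are hypothetical names chosen for illustration.

```ruby
# Original pattern: {.*?} first, and both quantifiers lazy.
r_lazy = /(?<=\A|,)(?:{.*?}|.*?)(?=,|\z)/

two_sets = "65101km,Sedan,{AC,Heated Seats},2010,{starter motor, pneumatic tires}"
result = two_sets.scan(r_lazy)

p result
  #=> ["65101km", "Sedan", "{AC,Heated Seats}", "2010", "{starter motor, pneumatic tires}"]
```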

Answer 1: (score: 1)

You can use

s.scan(/(?:{[^{}]*}|[^,])+/)

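See the Rubular and Regex101 demos.

Pattern details

  • (?: - start of a non-capturing group:
    • {[^{}]*} - a {, then 0 or more characters other than { and }, then a }
  • | - or
    • [^,] - any single character other than ,
  • )+ - end of the group, repeated 1 or more times.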
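A minimal sketch of this pattern applied to the string from the question, assuming s holds that sample line:

```ruby
# Assumption: s is the sample line from the question.
s = "65101km,Sedan,Manual,18131A,FWD,Used,5.5L/100km,Toyota,camry,SE,{AC,Heated Seats, Heated Mirrors, Keyless Entry},2010"

fields = s.scan(/(?:{[^{}]*}|[^,])+/)

p fields.length  # 12
p fields[10]     # "{AC,Heated Seats, Heated Mirrors, Keyless Entry}"

# The + quantifier requires at least one character per match,
# so an empty field between two commas is skipped.
sparse = "a,,b".scan(/(?:{[^{}]*}|[^,])+/)
p sparse         # ["a", "b"]
```

Because the group must match at least once, this pattern never yields an empty string. If empty fields need to be preserved as "", that behaviour has to be handled separately.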