Capture and replace every word in a bracketed list

Time: 2018-01-07 01:05:33

Tags: ruby regex

Say I have a text such as

"tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn
tnhetnh [ me, figure, this, out ] ihnteahntanitnh
nhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen
"

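How can I capture each word inside the brackets so that I can surround it with single quotes?

I tried \[\s?(?:(\w*),?\s?)+\] but it doesn't seem to capture anything, even though it does match the bracketed part.

The words inside the brackets can be anything.

I would like to use gsub on each line.

3 answers:

Answer 0: (score: 1)

You can try this: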

original = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"
clone = original.dup # copy first: plain assignment would alias original, so gsub! would mutate it as well
original.scan(/\[(.*)\]/).flatten.map { |elem| [elem, elem.gsub(/\w+/) { |match| %Q('#{match}') }] }.each { |(pattern, replacement)| clone.gsub!(pattern, replacement) }
puts clone # =>
# tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn
# tnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh
# nhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen

Answer 1: (score: 1)

r = /
    (?<=[ ])  # match a space in a positive lookbehind
    \p{L}+    # match one or more letters
    (?=       # begin a positive lookahead
      [^\[]+? # match one or more characters other than a left bracket, lazily
      \]      # match a right bracket
    )         # end the positive lookahead
    /x        # free-spacing regex definition mode

Letting str be the string defined in the question, we can enclose the words between brackets in single quotes as follows.

str.gsub(r) { |s| "'#{s}'" }
  #=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ]
  #    thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ]
  #    ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because',
  #    'i', 'dont', 'know' ] thnthen\n"

If we instead wish to extract those words, we would use String#scan:

str.scan(r)
  #=> ["hello", "there", "will", "you", "help", "me", "figure", "this",
  #    "out", "please", "because", "i", "dont", "know"]

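The question mark at the end of [^\[]+? (making it lazy rather than greedy) is there for efficiency, but it is not required.

I used free-spacing mode to make the regex self-documenting. Conventionally, it would be written as follows.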
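As a quick check of that claim, here is a minimal sketch comparing the lazy and greedy forms of the lookahead. The variable names (lazy, greedy, lazy_words, greedy_words) are only illustrative. Because a lookahead only asks whether a match exists, both forms should select the same words.

```ruby
str = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\n" \
      "tnhetnh [ me, figure, this, out ] ihnteahntanitnh\n" \
      "nhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"

lazy   = /(?<= )\p{L}+(?=[^\[]+?\])/ # lazy quantifier inside the lookahead
greedy = /(?<= )\p{L}+(?=[^\[]+\])/  # same regex with a greedy quantifier

lazy_words   = str.scan(lazy)
greedy_words = str.scan(greedy)

# The quantifier only changes how the engine searches, not whether it succeeds.
puts lazy_words == greedy_words
```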

     /(?<= )\p{L}+(?=[^\[]+?\])/

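This assumes (as in the example) that the brackets are matched and not nested, and that each bracketed word is preceded by a space and followed by a comma or a space. If these assumptions about the characters surrounding the bracketed words do not hold, the regex can be adjusted.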
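As one possible adjustment, suppose the words may directly follow the left bracket or a comma, with no spaces at all. That format is a hypothetical one, not taken from the question. A sketch under that assumption widens the lookbehind and lets the lookahead match zero characters before the right bracket:

```ruby
# Hypothetical input: no spaces around the words inside the brackets.
compact = "abc [hello,there] xyz"

# Lookbehind accepts a left bracket, a space or a comma;
# the lookahead uses * so a word may sit right before the "]".
adjusted = /(?<=[\[ ,])\p{L}+(?=[^\[]*\])/

quoted = compact.gsub(adjusted) { |s| "'#{s}'" }
puts quoted
```

The assumptions about matched, non-nested brackets still apply to this variant.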

Answer 2: (score: 0)

Perhaps a double gsub:

s = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"

s.gsub(/\[.*?\]/) { |m| m.gsub(/\w+/, '\'\0\'') }
 #=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen\n"
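Since the question mentions applying gsub line by line, the same double-gsub idea can be wrapped in a small helper. This is only a sketch: quote_bracketed_words is a made-up name, and the sample text is shortened for illustration. Note that \w also matches digits and underscores, so those are quoted too.

```ruby
# Hypothetical helper: quotes every \w+ run inside each [...] group of one line.
def quote_bracketed_words(line)
  line.gsub(/\[.*?\]/) do |list|             # outer gsub: each bracketed list
    list.gsub(/\w+/) { |word| "'#{word}'" }  # inner gsub: each word in that list
  end
end

text = "abc [ me, figure ] def\nghi [ this, out ] jkl\n"

# Process the text one line at a time and stitch the lines back together.
result = text.each_line.map { |line| quote_bracketed_words(line) }.join
puts result
```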