Say I have a text such as
"tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn
tnhetnh [ me, figure, this, out ] ihnteahntanitnh
nhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen
"
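How can I capture each word inside the brackets so that I can wrap it in single quotes?
I tried \[\s?(?:(\w*),?\s?)+\]
but it does not seem to capture anything, even though it matches the bracketed part.
The words inside the brackets can be anything.
I would like to use gsub on each line.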
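A note on why an attempt of this shape captures so little: a capture group placed inside a repeated group has a single slot, and each repetition overwrites it. The sketch below illustrates this with a simplified variant of the pattern (it uses \w+ rather than \w*, and a short made-up input), so it is an illustration of the general behaviour rather than a trace of the exact regex above.

```ruby
# A capture group inside a repeated group is overwritten on every iteration,
# so only the text from the final iteration is kept.
m = "a, b, c".match(/(?:(\w+),?\s?)+/)
whole = m[0]   # everything the pattern matched
last  = m[1]   # the single capture slot

# String#scan returns every match of a simpler pattern instead.
words = "a, b, c".scan(/\w+/)

puts whole          # a, b, c
puts last           # c
puts words.inspect  # ["a", "b", "c"]
```

This is why the answers below rely on scan or gsub over a per-word pattern rather than on one capture group repeated inside the brackets.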
Answer 0 (score: 1)
You can try this:
original = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"
clone = original.dup # work on a copy; a plain assignment would make gsub! modify original as well
original.scan(/\[(.*)\]/).flatten.map { |elem| [elem, elem.gsub(/\w+/) { |match| %Q('#{match}') }] }.each { |(pattern, replacement)| clone.gsub!(pattern, replacement) }
puts clone # =>
# tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn
# tnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh
# nhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen
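The one-liner above packs three steps into a single chain. A minimal sketch that spells them out, using a shortened sample string and the hypothetical variable names inners and pairs, might look like this:

```ruby
original = "ab [ hello, there ] cd\nef [ me, too ] gh\n"

# Work on a copy so that `original` is left untouched.
clone = original.dup

# 1. Collect the text between each pair of brackets. There is one pair per
#    line here, and `.` does not match a newline by default.
inners = original.scan(/\[(.*)\]/).flatten

# 2. Pair each bracketed span with its quoted version.
pairs = inners.map { |inner| [inner, inner.gsub(/\w+/) { |word| "'#{word}'" }] }

# 3. Replace each span with its quoted version in the copy.
pairs.each { |inner, quoted| clone.gsub!(inner, quoted) }

puts clone
```

One thing to keep in mind with this approach: step 3 passes a plain string as the pattern, so gsub! replaces every occurrence of that substring, including any that happen to appear outside the brackets.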
Answer 1 (score: 1)
r = /
(?<=[ ]) # match a space in a positive lookbehind
\p{L}+ # match one or more letters
(?= # begin a positive lookahead
[^\[]+? # match one or more characters other than a left bracket, lazily
\] # match a right bracket
) # end the positive lookahead
/x # free-spacing regex definition mode
Letting str be the string defined in the question, we can surround the words between the brackets with single quotes as follows.
str.gsub(r) { |s| "'#{s}'" }
#=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ]
# thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ]
# ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because',
# 'i', 'dont', 'know' ] thnthen\n"
If we instead wished to extract those words, we would use String#scan.
str.scan(r)
#=> ["hello", "there", "will", "you", "help", "me", "figure", "this",
# "out", "please", "because", "i", "dont", "know"]
The question mark at the end of [^\[]+? (which makes the match lazy rather than greedy) is there for efficiency; it is not required.
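As a quick check of that claim, the sketch below runs the lazy and the greedy form of the lookahead over a shortened sample string (an assumption made to keep the example small) and compares the results.

```ruby
str = "ab [ hello, there ] cd\nef [ me, too ] gh\n"

lazy   = /(?<= )\p{L}+(?=[^\[]+?\])/   # lookahead stops at the nearest "]"
greedy = /(?<= )\p{L}+(?=[^\[]+\])/    # lookahead runs ahead, then backtracks

lazy_words   = str.scan(lazy)
greedy_words = str.scan(greedy)

puts lazy_words.inspect          # ["hello", "there", "me", "too"]
puts lazy_words == greedy_words  # true
```

Both forms select the same words; they differ only in how much text the lookahead examines before it finds the right bracket.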
I used free-spacing mode to make the regex self-documenting. Conventionally, it would be written as follows.
/(?<= )\p{L}+(?=[^\[]+?\])/
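This assumes (as in the example) that the brackets are matched and not nested, and that each bracketed word is preceded by a space and followed by a comma or a space. If those assumptions about the characters surrounding the bracketed words do not hold, the regex can be adjusted.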
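As one possible adjustment, here is a sketch for input in which a word may sit directly against a bracket or a comma, such as [hello,there]. This variant is not part of the answer's regex: it swaps the space lookbehind for a "not preceded by a letter" check, and it lets the lookahead match zero characters before the right bracket. It still assumes matched, non-nested brackets.

```ruby
# Hypothetical variant for words that touch a bracket or comma directly.
r = /
  (?<!\p{L})     # the word must not begin partway through a run of letters
  \p{L}+         # match one or more letters
  (?=            # begin a positive lookahead
    [^\[\]]*     # match zero or more characters that are not brackets
    \]           # match a right bracket
  )              # end the positive lookahead
/x

tight  = "ab [hello,there] cd\nef [me,too] gh\n"
spaced = "x [ a, b ] y"

quoted_tight  = tight.gsub(r)  { |word| "'#{word}'" }
quoted_spaced = spaced.gsub(r) { |word| "'#{word}'" }

puts quoted_tight
puts quoted_spaced
```

Excluding both kinds of bracket inside the lookahead keeps it from running past the current bracket pair, so words after a closing bracket are left alone.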
Answer 2 (score: 0)
Possibly a double gsub:
s = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"
s.gsub(/\[.*?\]/) { |m| m.gsub(/\w+/, '\'\0\'') }
#=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen\n"
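For readers unfamiliar with the nested form, the sketch below restates it with comments and also shows a line-by-line version, since the question asks for gsub on each line. The shortened sample string and the variable names are assumptions for illustration.

```ruby
text = "ab [ hello, there ] cd\nef [ me, too ] gh\n"

# Outer gsub: find each bracketed section (lazy, so it stops at the first "]").
# Inner gsub: wrap every word in that section; \0 stands for the whole match.
all_at_once = text.gsub(/\[.*?\]/) { |section| section.gsub(/\w+/, "'\\0'") }

# The same substitution applied one line at a time.
line_by_line = text.each_line.map do |line|
  line.gsub(/\[.*?\]/) { |section| section.gsub(/\w+/) { |word| "'#{word}'" } }
end.join

puts all_at_once
puts all_at_once == line_by_line  # true
```

Because the outer pattern confines the inner gsub to the bracketed text, the word pattern itself can stay as simple as \w+ with no lookarounds.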