Question

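I have a string containing special characters, and I want to match only the characters surrounding a piece of text, not the text itself.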

"This is a [1]test[/1] string. And [2]test[/2]"

Rubular http://rubular.com/r/f2Xwe3zPzo

Currently, the code in the link also matches the text inside the special characters. How can I change that?

Update

To clarify my question: it should match only when the opening and closing tags contain the same number.

"[2]first[/2] [1]second[/2]"

In the string above, only first should match, not second. The text inside the special characters (first) should be ignored.

Answer 1

Try this:

(\[[0-9]\]).+?(\[\/[0-9]\])

Permalink to the example on Rubular.
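To see what this pattern captures, here is a minimal sketch run against the mismatched sample from the question's update. The variable names `pattern`, `sample`, and `tag_pairs` are illustrative only. Note that the two digits are matched independently, so a pair like [1]...[/2] is accepted as well.

```ruby
# Pattern from the answer above; it does not require the two digits to agree.
pattern = /(\[[0-9]\]).+?(\[\/[0-9]\])/

sample = "[2]first[/2] [1]second[/2]"

# With capture groups present, String#scan returns the groups of each match,
# here the opening tag and the closing tag.
tag_pairs = sample.scan(pattern)
# tag_pairs => [["[2]", "[/2]"], ["[1]", "[/2]"]]
```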

Update

Since you want to remove the 'special' characters, try this:

foo = "This is a [1]test[/1] string. And [2]test[/2]"
foo.gsub(/\[\/?\d\]/, "")
# => "This is a test string. And test"

Update, part 2

If you only want to remove the 'special' characters when the surrounding tags match, then use this:

foo = "This is a [1]test[/1] string. And [2]test[/2], but not [3]test[/2]"
foo.gsub(/(?:\[(?<number>\d)\])(?<content>.+?)(?:\[\/\k<number>\])/, '\k<content>')
# => "This is a test string. And test, but not [3]test[/2]"

Answer 2

\[([0-9])\].+?\[\/\1\]

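([0-9]) is a capture group because it is enclosed in parentheses. \1 tells the regex to match whatever that group captured. If you have multiple capture groups, you can reference them too, as \2, \3, and so on.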
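As a quick illustration of the backreference, here is a small sketch using the mismatched sample from the question's update. The names `pattern`, `sample`, `matched_numbers`, and `first_match` are chosen for this example only.

```ruby
# \1 must repeat the digit captured by ([0-9]), so [1]...[/2] is rejected.
pattern = /\[([0-9])\].+?\[\/\1\]/

sample = "[2]first[/2] [1]second[/2]"

# scan returns the capture group of every match; only the [2]...[/2] pair qualifies.
matched_numbers = sample.scan(pattern).flatten
# matched_numbers => ["2"]

# String#[] with a regex returns the first full match.
first_match = sample[pattern]
# first_match => "[2]first[/2]"
```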

Rubular

You can also use a named capture instead of \1 to make it less cryptic. For example: \[(?<number>[0-9])\].+?\[\/\k<number>\]
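A short sketch of reading named captures back from the MatchData may help here. The extra `content` group is an assumption added for illustration and is not part of the pattern above.

```ruby
# Named groups: `number` is the tag digit, `content` (added for this example)
# is the text between the tags. \k<number> refers back to the `number` group.
pattern = /\[(?<number>[0-9])\](?<content>.+?)\[\/\k<number>\]/

m = pattern.match("This is a [1]test[/1] string.")

# Named groups can be read from the MatchData by symbol.
number  = m[:number]
content = m[:content]
# number  => "1"
# content => "test"
```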

Answer 3

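Here is an approach that uses the block form of String#gsub. The idea is to pass substrings such as "[1]test[/1]" into the block and then strip out the unwanted bits there.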

str = "This is a [1]test[/1] string. And [2]test[/2], plus [3]test[/99]"

r = /
    \[    # match a left bracket
    (\d+) # capture one or more digits in capture group 1 
    \]    # match a right bracket
    .+?   # match one or more characters lazily
    \[\/  # match a left bracket and forward slash
    \1    # match the contents of capture group 1 
    \]    # match a right bracket
    /x

str.gsub(r) { |s| s[/(?<=\]).*?(?=\[)/] }
  #=> "This is a test string. And test, plus [3]test[/99]"
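To make the block's inner step easier to follow, the sketch below runs the lookaround regex on a single matched chunk. It then shows one possible variation, not part of the answer, that captures the content in a second group and returns it via Regexp.last_match. The names `chunk`, `inner`, `with_content`, and `stripped` are hypothetical.

```ruby
# A matched chunk, as gsub would hand it to the block.
chunk = "[1]test[/1]"

# The lookbehind (?<=\]) starts right after the first "]", and the lazy .*?
# stops just before the next "[" because of the lookahead (?=\[).
inner = chunk[/(?<=\]).*?(?=\[)/]
# inner => "test"

# Hypothetical variation: capture the content in group 2 and return it from the block.
with_content = /\[(\d+)\](.+?)\[\/\1\]/
str = "This is a [1]test[/1] string. And [2]test[/2], plus [3]test[/99]"
stripped = str.gsub(with_content) { Regexp.last_match(2) }
# stripped => "This is a test string. And test, plus [3]test[/99]"
```

The two approaches differ only when the content itself contains a "[": the lookaround version stops at the first bracket it sees, while the capture-group version keeps everything up to the matching closing tag.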

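As an aside: when I first heard of named capture groups they seemed like a good idea, but now I wonder whether they really make regular expressions any easier to read than \1, \2, and so on.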

Regex match characters around text
