Question

I need a regex for the following strings:

caption
"caption"
<caption>
[caption]
(caption)
etc

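In this context, caption is [a-zA-Z]. I can use a backreference for identical delimiters such as ", but what should I do about paired delimiters such as (), [], <>, etc.?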

Answer 1

If your regex engine supports conditionals, you can do this:

(?:(")|(<)|(\[)|(\())[A-Za-z]*(?(1)")(?(2)>)(?(3)\])(?(4)\))

Not that this is any more readable than the solutions proposed by @stema or @Anirudh :)

Explanation:

(?:       # Match either...
 (")      # a quote, capture it in group 1
|         # or
 (<)      # an opening angle bracket --> group 2
|         # or
 (\[)     # an opening bracket --> group 3
|         # or
 (\()     # an opening parenthesis --> group 4
)         # End of alternation
[A-Za-z]* # Match any ASCII letters
(?(1)")   # If group 1 matched before, then match a quote
(?(2)>)   # If group 2 matched before, then match a closing angle bracket
(?(3)\])  # If group 3 matched before, then match a closing bracket
(?(4)\))  # If group 4 matched before, then match a closing parenthesis
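
As a rough sketch of how this could be tried out, Python's re module is one engine that supports (?(n)...) conditionals. The trailing ? after the alternation is an assumption added here so that a bare caption with no delimiters is also accepted; the pattern in the answer itself requires an opening delimiter. The name is_caption is hypothetical.

```python
import re

# Conditional pattern from the answer. The "?" after the alternation is an
# assumption added here: it makes the opening delimiter optional so that a
# bare caption also matches.
CAPTION = re.compile(
    r'(?:(")|(<)|(\[)|(\())?[A-Za-z]*(?(1)")(?(2)>)(?(3)\])(?(4)\))'
)

def is_caption(text):
    """Return True if the whole string is a caption with matching delimiters."""
    return CAPTION.fullmatch(text) is not None

for sample in ['caption', '"caption"', '<caption>', '[caption]', '(caption)', '<caption]']:
    print(sample, is_caption(sample))
```

Because [A-Za-z]* allows zero letters, an empty pair such as <> would also be accepted; switch to + if that is not wanted.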

Answer 2

You need to specify each pair explicitly:

\[[a-zA-Z]+\]|<[a-zA-Z]+>|"[a-zA-Z]+"|\([a-zA-Z]+\)
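
A minimal sketch of the explicit alternation in use, assuming Python's re module and a made-up sample string. The angle brackets are left unescaped here because < and > need no escaping in Python and \< means a word boundary in some other engines.

```python
import re

# One alternative per delimiter pair, tried left to right at each position.
PAIRS = re.compile(r'\[[a-zA-Z]+\]|<[a-zA-Z]+>|"[a-zA-Z]+"|\([a-zA-Z]+\)')

# Hypothetical sample text; "<bad]" has mismatched delimiters.
text = 'a "foo" b <bar> c [baz] d (qux) e <bad]'

# The pattern has no capturing groups, so findall returns the full matches.
found = PAIRS.findall(text)
print(found)
```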

Answer 3

A pattern has no way of knowing which two different characters belong together. You have to list those cases in an alternation:

(["'])[a-zA-Z]*\1|<[a-zA-Z]*>|\[[a-zA-Z]*\]|\([a-zA-Z]*\)

See it here on Regexr
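
Since every pair has to be listed anyway, the alternation can be generated from a small table. The following is only a sketch under that idea: BRACKETS, build_pattern and the sample text are hypothetical names and data, and the quote case still relies on the \1 backreference from the pattern above.

```python
import re

# Hypothetical lookup table: opening delimiter -> closing delimiter.
BRACKETS = {"<": ">", "[": "]", "(": ")"}

def build_pattern(brackets, body="[a-zA-Z]*"):
    """Build one alternation: quotes via backreference, brackets via the table."""
    # Identical delimiters: capture the opening quote and require it again.
    parts = [r"""(["'])""" + body + r"\1"]
    # Different delimiters: one explicit alternative per pair.
    parts += [re.escape(o) + body + re.escape(c) for o, c in brackets.items()]
    return re.compile("|".join(parts))

pattern = build_pattern(BRACKETS)
text = """say "one" or 'two' or <three> or [four] or (five), not "six' or <seven)"""

# group(0) is the whole match; group(1) is only set for the quote case.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
```

Adding a new pair such as {} then only means adding one entry to the table.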

Answer 4

I believe this can't be done without a lot of | alternatives:

<[a-zA-Z]+>|\[[a-zA-Z]+\]|\([a-zA-Z]+\)

or, at the risk of more false positives:

[<\[\(][a-zA-Z]+[>\]\)]
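
To make the false-positive risk concrete, here is a small comparison, assuming Python's re module and a hypothetical list of samples. The looser pattern accepts any opening delimiter followed by any closing one, so mismatched pairs get through.

```python
import re

# Strict: one alternative per pair. Loose: any opener, then any closer.
STRICT = re.compile(r"<[a-zA-Z]+>|\[[a-zA-Z]+\]|\([a-zA-Z]+\)")
LOOSE = re.compile(r"[<\[(][a-zA-Z]+[>\])]")

# Hypothetical samples; the last two have mismatched delimiters.
samples = ["<caption>", "[caption]", "<caption]", "(caption>"]

strict_hits = [s for s in samples if STRICT.fullmatch(s)]
loose_hits = [s for s in samples if LOOSE.fullmatch(s)]

print(strict_hits)  # only the correctly paired samples
print(loose_hits)   # also includes the mismatched ones
```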

If you need to replace the matches, many programming languages support callback functions:

http://docs.python.org/2/library/re.html#re.sub

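If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.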
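
A minimal sketch of such a callback with re.sub, assuming the goal is to strip the delimiters and upper-case the caption. That goal, the function name unwrap and the sample text are made up for illustration; they are not taken from the Python documentation.

```python
import re

# One capturing group per pair, so the callback can get at the bare caption.
PATTERN = re.compile(
    r'"([a-zA-Z]+)"|<([a-zA-Z]+)>|\[([a-zA-Z]+)\]|\(([a-zA-Z]+)\)'
)

def unwrap(match):
    """Replacement callback: return the caption without delimiters, upper-cased."""
    # Exactly one of the four groups took part in the match; the rest are None.
    caption = next(g for g in match.groups() if g is not None)
    return caption.upper()

result = PATTERN.sub(unwrap, 'see <fig> and [tab] and "note" and (aside)')
print(result)
```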
