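How to match similar groups that repeat but are not identical

Date: 2013-01-10 12:30:52

Tags: regex

I have many product-related strings, each containing a reference number. I want to write a regular expression that matches only when a string mentions more than one distinct reference number. So, given the following examples: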

"AB01 MyProduct" >>> No match - because there is only one ID
"AB02 MyOtherProduct" >>> No match - because there is only one ID
"AB03 YetAnotherProduct" >>> No match - because there is only one ID
"AnAccessory for AB01, AB02, AB03 or AB101" >>> Matches!!
"AB02 MyOtherProduct, MyOtherProduct called the AB02" >>> No match - because the codes are the same

Can anyone give me some pointers?

1 Answer:

Answer 0: (score: 2)

If your regex engine supports negative lookaheads, this will do the trick:

(AB\d+).*?(?!\1)AB\d+

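It matches if the string contains two sequences matching AB\d+ and the second one differs from the first (which the negative lookahead checks).

Explanation: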

(           # start capture group 1
 AB         # match `AB` literally
 \d+        # match one or more digits
)           # end capture group 1
.*?         # match any sequence of characters, non-greedy
(?!         # start negative lookahead, match this position if it does not match
 \1         # whatever was captured in capture group 1
)           # end lookahead
AB          # match `AB` literally
\d+         # match one or more digits
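
One edge case worth noting: the lookahead compares `\1` as a prefix, so a later code that merely starts with the first one (for example `AB01` followed by `AB011`) is rejected as if it were the same code. Below is a minimal sketch of a stricter variant, assuming the codes are always delimited by non-word characters such as spaces or commas. The names `loosePattern` and `strictPattern` are made up for illustration and do not appear in the original answer.

```javascript
// The pattern from the answer, unchanged.
const loosePattern = /(AB\d+).*?(?!\1)AB\d+/;

// Sketch of a stricter variant (assumption: codes are surrounded by
// non-word characters). The \b anchors force both codes to be matched
// whole, so "AB011" is no longer treated as a repeat of "AB01".
const strictPattern = /\b(AB\d+)\b.*?\b(?!\1\b)AB\d+\b/;

const sample = "AB01 and AB011";
console.log(loosePattern.test(sample));  // false
console.log(strictPattern.test(sample)); // true
```

The word boundaries also stop the engine from backtracking into a shorter capture such as `AB0`, which keeps the comparison between complete codes.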

Test (JavaScript):

> var pattern = /(AB\d+).*?(?!\1)AB\d+/;
> pattern.test("AB01 MyProduct")
  false
> pattern.test("AnAccessory for AB01, AB02, AB03 or AB101")
  true
> pattern.test("AB02 MyOtherProduct, MyOtherProduct called the AB02")
  false
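
If the regex engine at hand lacks lookaheads or backreferences, one possible alternative is to extract every code and count the distinct values in ordinary code. The sketch below is a different technique from the single-regex answer above, and `hasDistinctCodes` is a hypothetical helper name. It is run against the five examples from the question.

```javascript
// Hypothetical helper: collect every AB<digits> code in the text and
// report whether more than one distinct code was found.
function hasDistinctCodes(text) {
  // String.prototype.match with the g flag returns null when nothing matches.
  const codes = text.match(/AB\d+/g) || [];
  return new Set(codes).size > 1;
}

const examples = [
  "AB01 MyProduct",
  "AB02 MyOtherProduct",
  "AB03 YetAnotherProduct",
  "AnAccessory for AB01, AB02, AB03 or AB101",
  "AB02 MyOtherProduct, MyOtherProduct called the AB02",
];

for (const text of examples) {
  console.log(hasDistinctCodes(text), text);
}
```

Because whole codes are compared as set members, this version does not depend on which code appears first in the string.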