Question

How do I match an img tag using a negative lookahead?

/(<img (?!.*\/>).*\/>)/i

Answer 1

That would be:

/(<img(?:(?!\/>).)+\/>)/i

But that is not the most efficient solution. Using a lookahead, the most efficient is:

/(<img[^>]+(?:\/(?!>)[^>]*)*\/>)/i

Broken down, this gives:

(              # begin capture
    <img       # literal "<img", followed by
    [^>]+      # everything but ">", once or more, followed by
    (?:        # begin non capturing group
      /(?!>)   # a "/", as long as it is not followed by a ">", followed by
      [^>]*    # everything but ">", zero or more times,
    )*         # zero or more times, followed by
    />         # literal "/>"
)              # end capture

This is another application of the normal* (special normal*)* pattern, where normal is [^>] and special is /(?!>):

$ perl -ne 'm,(<img[^>]+(?:/(?!>)[^>]*)*/>), and print "-->$1<--\n"' <<EOF
no image tag here
Here there is one: <img src="foo/bar.gif"/>
<img whatever bla bla> (no match, no / before >)
EOF

--><img src="foo/bar.gif"/><--
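For readers who want to try this outside Perl, here is a minimal sketch using Python's re module (chosen for this illustration only; the answer itself uses Perl). It runs a per-character lookahead form, written here as (?:(?!/>).)+, next to the unrolled pattern, on the same three sample lines. The names TEMPERED, UNROLLED and find_img are hypothetical.

```python
import re

# Per-character form: consume one character at a time, as long as "/>" does not start there.
TEMPERED = re.compile(r'(<img(?:(?!/>).)+/>)', re.IGNORECASE)

# Unrolled form from the answer: normal* (special normal*)*
UNROLLED = re.compile(r'(<img[^>]+(?:/(?!>)[^>]*)*/>)', re.IGNORECASE)

def find_img(pattern, line):
    """Return the first captured img tag in line, or None if nothing matches."""
    m = pattern.search(line)
    return m.group(1) if m else None

lines = [
    'no image tag here',
    'Here there is one: <img src="foo/bar.gif"/>',
    '<img whatever bla bla> (no match, no / before >)',
]

for line in lines:
    print(find_img(TEMPERED, line), find_img(UNROLLED, line))
```

Only the second line yields a match with either pattern, and it is the same tag the Perl run prints.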

Answer 2

Why do you need a lookahead here? Can't you just do this:

/(<img\s[^>]+>)/i

However, allow me to strongly suggest using a DOM parser here rather than a regex, because a regex is error-prone with image tags like this one:

<img src="greater.jpg" alt="x > y" height="10" width="10">
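
To illustrate the point, here is a minimal sketch contrasting the regex with a real HTML parser. It uses Python's standard-library html.parser.HTMLParser, picked for this illustration only (the answer does not name a particular parser). The class name ImgCollector is hypothetical.

```python
import re
from html.parser import HTMLParser

html = '<img src="greater.jpg" alt="x > y" height="10" width="10">'

# The regex stops at the first ">", which sits inside the alt attribute.
regex_match = re.search(r'(<img\s[^>]+>)', html, re.IGNORECASE).group(1)

class ImgCollector(HTMLParser):
    """Collect the attributes of every img tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == 'img':
            self.images.append(dict(attrs))

collector = ImgCollector()
collector.feed(html)
collector.close()

print(regex_match)       # the tag as the regex sees it, cut off inside alt
print(collector.images)  # the attributes as the parser sees them
```

The regex capture ends inside the alt value, while the parser reports all four attributes, including the alt text with its ">" intact.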

Answer 3

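Use the ungreedy pattern modifier U.

U (PCRE_UNGREEDY)

This modifier inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by ?. It is not compatible with Perl. It can also be set by a (?U) modifier setting within the pattern or by a question mark behind a quantifier (e.g. .*?).

This will grab everything until it encounters /> (the end of the img tag).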

'/(<img(.*)\/>)/Ui'
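
The effect of ungreedy matching can be sketched outside PHP as well. Python's re module has no U modifier, so this sketch (an illustration, not the answer's code) uses the per-quantifier form .*? that the quoted documentation mentions. The sample string and variable names are made up.

```python
import re

text = '<img src="a.gif"/> some text <img src="b.gif"/>'

# Greedy: .* runs to the last "/>" in the string, swallowing both tags.
greedy = re.search(r'(<img(.*)/>)', text, re.IGNORECASE).group(1)

# Lazy: .*? stops at the first "/>", the behaviour the U modifier gives in PCRE.
lazy = re.search(r'(<img(.*?)/>)', text, re.IGNORECASE).group(1)

print(greedy)
print(lazy)
```

The greedy capture spans the whole string, while the lazy one ends at the first tag's closing "/>".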
