Question

Possible duplicate:
  RegEx match open tags except XHTML self-contained tags
  How to parse and process HTML/XML with PHP?

I have a line of code that is part of an image download script, shown below:

preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $content, $matches);

I need to change it to include:

id="iwi"

in the preg_match_all call. The img tag always has the following format:

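I tried a few different variations and got errors. My last attempt, below, leaves out the quotes, but it still matches nothing. Is my syntax wrong?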

preg_match_all('|<img.*?id=iwi.*?src=[\'"](.*?)[\'"].*?>|i', $content, $matches);

Answer 1

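This is the number one problem with The Pony He Comes. You don't know whether the tag is <img id="iwi" src="image.png" /> or <img src="image.png" id="iwi" />, and a pattern that expects id before src silently misses the second form.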

Instead, you should use a parser:

$dom = new DOMDocument();
$dom->loadHTML($content);
$img = $dom->getElementById("iwi");
$src = $img->getAttribute("src");
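
A slightly more defensive sketch of the same idea follows. The helper name findImageSrcById and the sample markup are made up for illustration and are not part of the original answer. It assumes the page may be malformed (so libxml warnings are collected rather than printed) and that the element may be missing. It compares the id attribute directly instead of calling getElementById, which is a substitution on my part, not the answerer's method:

```php
<?php
// Hypothetical helper (the name is illustrative): returns the src of the first
// <img> whose id attribute equals $id, or null if there is no such element.
function findImageSrcById(string $content, string $id): ?string
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($content === '') {
        return null;
    }

    // Collect libxml parse warnings instead of emitting them;
    // real-world HTML is rarely well formed.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($content);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Attribute order inside the tag is irrelevant once the markup is parsed.
    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->getAttribute('id') === $id) {
            // getAttribute() returns an empty string if src is absent.
            return $img->getAttribute('src');
        }
    }

    return null;
}

// Both attribute orders from the answer give the same result.
var_dump(findImageSrcById('<img id="iwi" src="image.png" />', 'iwi'));
var_dump(findImageSrcById('<img src="image.png" id="iwi" />', 'iwi'));
```

Returning null for a missing element lets the calling download script skip the page instead of failing on a method call on null.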

Answer 2

If you insist on using preg despite all the objections, these also work:

// [\'"]* makes the quotes optional, since " or ' is sometimes missing; * matches zero or more occurrences
preg_match_all('~<img.*?id=[\'"]*([^\s\'"]*).*?src=[\'"]*([^\s\'"]*).*?>~i', $content, $matches);
preg_match_all('~<img.*?id=[\'"]*(?P<id>[^\s\'"]*).*?src=[\'"]*(?P<src>[^\s\'"]*).*?>~i', $content, $matches);
print_r($matches);
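
The two patterns above still require id to come before src. If regex is a hard requirement, one possible variation is to check for the id in a lookahead so that attribute order stops mattering. This is only a sketch: the pattern and the sample markup are my own illustration, it assumes attribute values contain no whitespace and no > character, and it is no substitute for a parser on arbitrary HTML. The three sample tags are alternative forms placed in one string for the demo.

```php
<?php
// Illustrative pattern: the lookahead confirms id=iwi somewhere inside the
// current tag (quotes optional), then the main part captures the src value.
$pattern = '~<img\b(?=[^>]*\sid=["\']?iwi["\']?(?=[\s/>]))[^>]*?\ssrc=["\']?([^\s"\'>]+)~i';

// Sample markup (hypothetical): id before src, src before id, and a different id.
$content = '<img id="iwi" src="a.png" />'
         . '<img src="b.png" id="iwi" />'
         . '<img id="other" src="c.png" />';

preg_match_all($pattern, $content, $matches);

// $matches[1] holds the captured src values: a.png and b.png, but not c.png.
print_r($matches[1]);
```

Requiring whitespace before id= and src= keeps attributes such as data-id or data-src from matching by accident.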

Correct syntax for preg_match_all
