Question

I am trying to write a preg_match call that finds a sequence of tags in an HTML document.

Example HTML:

<div class="importantclass">
  <p>some thing</p>
  <p>some more things</p>
</div>
<div class="importantclass">
  <b>some text</b>
  <p>NEEDLE</p>
</div>

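I need to find the combination of a div with class="importantclass" followed by a p tag that contains the specific NEEDLE text.

I then need to return the position of that opening div. Note: the match must not begin at the first importantclass div merely because that div occurs earlier in the document.

Is it possible to do this with regular expressions only, without using the DOM?

Thanks for any hints!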

Answer 1

Does this work for you?

<?php
    $html = <<< LOB
<div class="importantclass">
  <p>some thing</p>
  <p>some more things</p>
</div>
<div class="importantclass">
  <b>some text</b>
  <p>FIND ME</p>
</div>
LOB;

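    $needle = "FIND ME";
    // Collect every importantclass div block (each one ends at its first closing </div>).
    preg_match_all('%(<div.*?class="importantclass">.*?</div>)%sim', $html, $matches, PREG_PATTERN_ORDER);
    // Escape the needle so any regex metacharacters in it are matched literally.
    $needlePattern = '%<p>' . preg_quote($needle, '%') . '</p>%im';
    for ($i = 0; $i < count($matches[1]); $i++) {
        if (preg_match($needlePattern, $matches[1][$i])) {
            echo "MATCH FOUND!<br>";
            echo "POSITION $i<br>"; // zero-based index of the matching div block
            echo htmlentities($matches[1][$i]);
        }
    }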
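
The loop above reports the index of the matching div block, while the question asks for the position of the opening div in the document. A minimal sketch of one way to get that position, assuming a byte offset into the HTML string is what is wanted: pass PREG_OFFSET_CAPTURE to preg_match_all so that every match comes back with its offset. The helper name findDivOffset is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the byte offset
// of the opening <div class="importantclass"> whose block contains
// <p>$needle</p>, or null when no such block exists.
function findDivOffset(string $html, string $needle): ?int
{
    $divPattern = '%<div[^>]*class="importantclass"[^>]*>.*?</div>%si';

    // With PREG_OFFSET_CAPTURE each match is a [text, byteOffset] pair.
    if (!preg_match_all($divPattern, $html, $matches, PREG_OFFSET_CAPTURE)) {
        return null;
    }

    // Escape the needle so it is treated as literal text inside the pattern.
    $needlePattern = '%<p>' . preg_quote($needle, '%') . '</p>%i';

    foreach ($matches[0] as [$block, $offset]) {
        if (preg_match($needlePattern, $block)) {
            return $offset;
        }
    }
    return null;
}

$html = <<<HTML
<div class="importantclass">
  <p>some thing</p>
  <p>some more things</p>
</div>
<div class="importantclass">
  <b>some text</b>
  <p>NEEDLE</p>
</div>
HTML;

$offset = findDivOffset($html, 'NEEDLE');
var_dump($offset);                   // offset of the second div, not the first
var_dump(substr($html, $offset, 4)); // the text at that offset starts with "<div"
```

Like the answer's pattern, this sketch ends each block at the first closing </div>, so it assumes the importantclass divs contain no nested divs. The offsets are byte offsets, which differ from character positions when the HTML contains multibyte characters.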

Using preg_match to find a specific combination of HTML tags
