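PHP: remove tags that appear before a specified tag

Date: 2013-05-09 19:47:57

Tags: php html dom

I want to remove all image tags that appear before the headlines start, but they are not always nested the same way. After that, I want to remove the tags that are left empty.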

<div class="c2">
  <img src="image/file" width="480" height="360" alt="Image" />
</div>
<div class="c2">
  <div class="headline">
    headline
  </div>
  <div class="headline">
    headline2
  </div>
</div>

and with different nesting, such as

<div class="c2">
  <p>
    <img src="image/A.JPG" width="480" height="319" alt="Image" />
  </p>
  <div class="headline">
    A headline
  </div>
</div>

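I think this can be solved recursively, but I don't know how.

Thanks for your help!

1 answer:

Answer 0: (score: 0)

Edit: if you only want to remove an <img> that is followed by <div><div class="headline"> or by <div class="headline">, use this XPath: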

$imgs = $xpath->query("//img[../following-sibling::div[1]/div/@class='headline' or ../following-sibling::div[1]/@class='headline']");

See it working: http://codepad.viper-7.com/QhprLP
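Combined with the removal loop, the selective query could be used roughly as in the sketch below. The helper name `removeImagesBeforeHeadline` and the sample markup in `$html` are assumptions made for illustration; they are not part of the original answer. The emptiness check treats a parent as empty when it has no element children and only whitespace text.

```php
<?php
// Hypothetical helper: removes only the <img> nodes matched by the XPath above,
// then removes a parent that is left with nothing but whitespace.
function removeImagesBeforeHeadline(string $html): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $imgs = $xpath->query(
        "//img[../following-sibling::div[1]/div/@class='headline'"
        . " or ../following-sibling::div[1]/@class='headline']"
    );

    foreach ($imgs as $img) {
        $parent = $img->parentNode;
        $parent->removeChild($img); // delete the <img> node

        // The parent counts as empty if it has no element children
        // and its remaining text is whitespace only.
        $isEmpty = trim($parent->textContent) === ''
            && $xpath->query('./*', $parent)->length === 0;
        if ($isEmpty && $parent->parentNode !== null) {
            $parent->parentNode->removeChild($parent);
        }
    }

    // Note: saveHTML() returns the whole document, including the
    // <html> and <body> wrappers that loadHTML() adds.
    return $doc->saveHTML();
}

// Sample input (an assumption, modelled on the second snippet in the question).
$html = '<div class="c2"><p><img src="image/A.JPG" alt="Image"></p>'
      . '<div class="headline">A headline</div></div>';

echo removeImagesBeforeHeadline($html);
```

An image that is not directly followed by a headline block is left untouched by this query, which is the difference from selecting `//img`.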

Do it like this:

$doc = new DOMDocument();
$doc->loadHTML($x); // assuming HTML in $x
$xpath = new DOMXpath($doc);
$imgs = $xpath->query("//img"); // select all <img> nodes

foreach ($imgs as $img) { // loop through the list of all <img> nodes
    $parent = $img->parentNode;
    $parent->removeChild($img); // delete the <img> node
    // if the parent of the <img> is now empty (no child elements, only whitespace), delete it too
    if (trim($parent->textContent) === '' && $xpath->query('./*', $parent)->length === 0) {
        $parent->parentNode->removeChild($parent);
    }
}

echo htmlentities($doc->saveHTML()); // display the new HTML

See it working: http://codepad.viper-7.com/350Hw6
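As a possible alternative that follows the wording of the question more literally, the XPath `preceding` axis can select every `<img>` that occurs before the first headline in document order, regardless of nesting. This is a sketch of a different technique, not the answer's method; the helper `isEffectivelyEmpty` and the sample markup are hypothetical. The `while` loop handles the "recursive" part by walking up and removing each ancestor that the deletion left empty.

```php
<?php
// Hypothetical helper: true when the node has no element children
// and its text content is whitespace only.
function isEffectivelyEmpty(DOMNode $node): bool
{
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            return false;
        }
    }
    return trim($node->textContent) === '';
}

// Sample input (an assumption): one image before the headline, one after it.
$html = '<div class="c2"><p><img src="image/A.JPG" alt="Image"></p>'
      . '<div class="headline">A headline</div>'
      . '<img src="image/B.JPG" alt="Kept"></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Every <img> that comes before the first headline in document order.
$imgs = $xpath->query("(//div[@class='headline'])[1]/preceding::img");

foreach ($imgs as $img) {
    $node = $img->parentNode;
    $node->removeChild($img);

    // Walk upwards, removing each ancestor element left empty by the deletion.
    while ($node instanceof DOMElement && isEffectivelyEmpty($node)) {
        $next = $node->parentNode;
        $next->removeChild($node);
        $node = $next;
    }
}

echo $doc->saveHTML();
```

Images that appear after the first headline, such as `image/B.JPG` in the sample, are not selected by this query and stay in the document.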