Question

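I am using DOMDocument to fetch HTML from a website. I want the HTML inside <body></body>, and I have that working. However, the body contains a <nav>...</nav> block. How can I exclude the <nav></nav> block using only DOMDocument?

Here is my code: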

<!DOCTYPE html>
<html>
<head>
    <title>Title Here</title>
</head>
<?php
  $d = new DOMDocument;
  $mock = new DOMDocument;
  $internalErrors = libxml_use_internal_errors(true);
  $d->loadHTML(file_get_contents('http://www.example.com'));
  $body = $d->getElementsByTagName('body')->item(0);
  foreach ($body->childNodes as $child){
      $mock->appendChild($mock->importNode($child, true));
  }
  libxml_use_internal_errors($internalErrors);
  echo $mock->saveHTML(); //<body>.....</body>
?>
</html>

Answer 1

Please see this accepted answer: PHP DOM: Get NodeValue excluding the child nodes

You can remove the "nav" nodes right after collecting all of the body's child nodes.
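
A minimal sketch of that idea, reusing the $d and $mock documents from the question. The inline $html string is a hypothetical stand-in for the fetched page, so the example runs without a network request. Since getElementsByTagName() returns a live list, the matches are copied into an array first so that removing nodes does not disturb the iteration.

```php
<?php
// Hypothetical sample markup standing in for file_get_contents('http://www.example.com').
$html = '<html><body><nav><a href="/">Home</a></nav><p>Hello</p>'
      . '<div><nav>inner</nav><span>World</span></div></body></html>';

$d = new DOMDocument;
$mock = new DOMDocument;

// Suppress libxml warnings (e.g. for HTML5 tags such as <nav>) while parsing.
$internalErrors = libxml_use_internal_errors(true);
$d->loadHTML($html);
libxml_use_internal_errors($internalErrors);

// Collect all child nodes of <body> into the mock document, as in the question.
$body = $d->getElementsByTagName('body')->item(0);
foreach ($body->childNodes as $child) {
    $mock->appendChild($mock->importNode($child, true));
}

// Copy the live node list to an array, then detach every <nav> from its parent.
foreach (iterator_to_array($mock->getElementsByTagName('nav')) as $nav) {
    $nav->parentNode->removeChild($nav);
}

$result = $mock->saveHTML();
echo $result;
```

This removes <nav> elements at any depth, not only direct children of <body>. If only top-level <nav> blocks need to go, skipping them inside the import loop (by checking $child->nodeName) would also work.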
