PHP DOM parser moves the closing div tag

Asked: 2018-02-14 21:51:33

Tags: php html string domparser

Here is my code:

$myHtml = '
<div class="div-class">
    <p>text</p>

    <p><a href="#">text</a></p>
</div>

<ul class="some-class">
    <li><a href="#" target="_blank" title="something something"><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src=""></a>
    </li>
</ul>
';

$doc = new \DOMDocument();
$doc->loadHTML($myHtml, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($doc);
$anchors = $xpath->query("//a[@title='something something']");
$list = $xpath->query("//ul[@class='some-class']")[0];
foreach ($anchors as $a) {
    $list->removeChild($a->parentNode);
}

var_dump($doc->saveHTML());

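Basically, I am trying to remove the list item that contains the anchor tag whose title is "something something". However, when I save the HTML after applying the change, the list ends up inside the div tag. Why does this happen? Thanks.

1 Answer:

Answer 0: (score: 3)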

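The parser tries to correct the markup: a document can have only one root element, so it does not leave the ul element parentless and instead moves it inside the div. If you wrap everything in a body tag, it works correctly.

loadHTML() would normally do this wrapping for you automatically, but you set the LIBXML_HTML_NOIMPLIED flag, which disables it.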

<?php
$myHtml = '
<html>
<body>
<div class="div-class">
    <p>text</p>

    <p><a href="#">text</a></p>
</div>

<ul class="some-class">
    <li><a href="#" target="_blank" title="something something"><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src=""></a>
    </li>
</ul>
</body>
</html>
';

$doc = new \DOMDocument();
$doc->loadHTML($myHtml, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($doc);
$anchors = $xpath->query("//a[@title='something something']");
$list = $xpath->query("//ul[@class='some-class']")[0];
foreach ($anchors as $a) {
    $list->removeChild($a->parentNode);
}

var_dump($doc->saveHTML());
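
If adding html and body tags to the markup is not convenient, a commonly used variation is to give the fragment a single temporary root element and strip it again on output. The following is only a sketch of that idea, not part of the original answer: the function name removeListItemsByAnchorTitle and the id fragment-root are made up for illustration, and it assumes ASCII input and a title that contains no double quotes.

```php
<?php
// Hypothetical helper, for illustration only.
// Assumptions: $fragment is ASCII HTML, $title contains no double quote.
function removeListItemsByAnchorTitle(string $fragment, string $title): string
{
    $doc = new DOMDocument();
    // Wrap the fragment in one temporary root element so that, even with
    // LIBXML_HTML_NOIMPLIED, the parser sees a single top-level node.
    $doc->loadHTML(
        '<div id="fragment-root">' . $fragment . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    $xpath = new DOMXPath($doc);
    // Select the <li> elements that directly contain a matching anchor,
    // then detach each one from whatever its parent is.
    foreach ($xpath->query('//li[a[@title="' . $title . '"]]') as $li) {
        $li->parentNode->removeChild($li);
    }

    // Serialize only the children of the temporary root, not the root itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<div class="div-class"><p>text</p></div>'
      . '<ul class="some-class">'
      . '<li><a href="#" title="something something">x</a></li>'
      . '<li><a href="#" title="">y</a></li>'
      . '</ul>';

echo removeListItemsByAnchorTitle($html, 'something something'), PHP_EOL;
```

Removing each li through its own parentNode, rather than through a separately queried ul, also avoids an exception if a matching item happens to sit in a different list.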

Alternatively, without the LIBXML_HTML_NOIMPLIED flag:

<?php
$myHtml = '
<div class="div-class">
    <p>text</p>

    <p><a href="#">text</a></p>
</div>

<ul class="some-class">
    <li><a href="#" target="_blank" title="something something"><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src="" alt=""></a>
    </li>
    <li><a href="" target="_blank" title=""><img src=""></a>
    </li>
</ul>
';

$doc = new \DOMDocument();
$doc->loadHTML($myHtml, LIBXML_HTML_NODEFDTD);
var_dump(libxml_get_errors()); // stays empty unless libxml_use_internal_errors(true) was called before loadHTML()
$xpath = new \DOMXPath($doc);
$anchors = $xpath->query("//a[@title='something something']");
$list = $xpath->query("//ul[@class='some-class']")[0];
foreach ($anchors as $a) {
    $list->removeChild($a->parentNode);
}

var_dump($doc->saveHTML());

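
Without LIBXML_HTML_NOIMPLIED, saveHTML() returns the markup together with the html and body wrappers that the parser added. If only the original fragment is wanted, one possible approach is to serialize the children of body one by one. This is a minimal sketch under that assumption; bodyInnerHtml is a hypothetical helper name, and the short input string is a stand-in for the markup above.

```php
<?php
// Hypothetical helper, for illustration only: returns the serialized
// children of <body>, i.e. the fragment without the implied wrappers.
function bodyInnerHtml(DOMDocument $doc): string
{
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $html = '';
    foreach ($body->childNodes as $child) {
        // saveHTML() accepts a node and serializes just that subtree.
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

// Collect parser warnings internally so libxml_get_errors() can return them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="div-class"><p>text</p></div>'
    . '<ul class="some-class"><li>item</li></ul>',
    LIBXML_HTML_NODEFDTD
);

$errors = libxml_get_errors(); // array of LibXMLError objects, possibly empty
libxml_clear_errors();

echo bodyInnerHtml($doc), PHP_EOL;
```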