Suppose I have this HTML, where the markup I want sits inside an HTML comment:
<html>
<body>
<code class="hidden">
<!--
<div class="a">
<div class="b">
<div class="c">
<a href="link">Link Test 1</a>
</div>
<div class="c">
<a href="link">Link Test 2</a>
</div>
<div class="c">
<a href="link">Link Test 3</a>
</div>
</div>
</div>
-->
</code>
<code>
<!-- test -->
</code>
</body>
</html>
Using DOMXPath in PHP, how can I get the links and their text from inside the comment?
This is what I have so far:
$dom = new DOMDocument();
$dom->loadHTML("HTML STRING"); # not actually in code
$xpath = new DOMXPath($dom);
$query = '/html/body/code/comment()';
$divs = $dom->getElementsByTagName('div')->item(0);
$entries = $xpath->query($query, $divs);
foreach ($entries as $entry) {
    # shows entire text block
    echo $entry->textContent;
}
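How do I navigate from there so that I can reach the divs with class "c" and put their links into an array?
Edit: Note that the page contains multiple <code> tags, so I can't simply select elements by the code tag name alone.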
Answer 0 (score: 1)
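You can already target the comment that contains the links. Take its text, load it into a second DOMDocument, and run another query inside that. For example: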
$sample_markup = '<html>
<body>
<code class="hidden">
<!--
<div class="a">
<div class="b">
<div class="c">
<a href="link">Link Test 1</a>
</div>
<div class="c">
<a href="link">Link Test 2</a>
</div>
<div class="c">
<a href="link">Link Test 3</a>
</div>
</div>
</div>
-->
</code>
</body>
</html>';
$dom = new DOMDocument();
$dom->loadHTML($sample_markup);
$xpath = new DOMXPath($dom);
// Select the comment nodes that sit directly inside <code> elements.
$query = '/html/body/code/comment()';
$entries = $xpath->query($query);
foreach ($entries as $comment) {
    // The comment text is itself HTML, so parse it as a separate document.
    $value = $comment->nodeValue;
    $html_comment = new DOMDocument();
    $html_comment->loadHTML($value);
    $xpath_sub = new DOMXPath($html_comment);
    $links = $xpath_sub->query('//div[@class="c"]/a'); // target the links!
    // Loop over each link and do whatever you need with it.
    foreach ($links as $link) {
        echo $link->getAttribute('href') . '<br/>';
    }
}
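Since the question asks for the links in an array and mentions that the page has several <code> tags, here is a minimal sketch of the same two-step approach that narrows the query to `<code class="hidden">` and collects each link's href and text. The function name `extractCommentLinks` and the `href`/`text` array keys are hypothetical, chosen for illustration. Matching on `@class="hidden"` assumes the class attribute holds exactly that value.

```php
<?php
// Hypothetical helper: collect the links found inside HTML comments
// that are children of <code class="hidden"> elements.
function extractCommentLinks(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $links = [];
    // Only comments inside <code class="hidden">, not every <code> element.
    foreach ($xpath->query('//code[@class="hidden"]/comment()') as $comment) {
        $value = $comment->nodeValue;
        if (trim($value) === '') {
            continue; // nothing to parse in an empty comment
        }
        // The comment text is itself HTML, so parse it as a second document.
        $inner = new DOMDocument();
        $inner->loadHTML($value);
        $innerXpath = new DOMXPath($inner);
        foreach ($innerXpath->query('//div[@class="c"]/a') as $a) {
            $links[] = [
                'href' => $a->getAttribute('href'),
                'text' => trim($a->textContent),
            ];
        }
    }

    // Discard collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $links;
}

// Usage with a shortened version of the sample markup.
$sample = '<html><body>
<code class="hidden"><!--
<div class="a"><div class="b">
<div class="c"><a href="link">Link Test 1</a></div>
<div class="c"><a href="link">Link Test 2</a></div>
</div></div>
--></code>
<code><!-- test --></code>
</body></html>';

print_r(extractCommentLinks($sample));
```

The second `<code>` block is skipped because its element has no `class="hidden"`, so only comments from the hidden block are parsed. If the class attribute can carry several class names, the predicate would need to be loosened, for example with `contains()`.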