PHP DOMXPath: loop through search results and find a child div value

Date: 2018-01-03 22:00:20

Tags: php domdocument domxpath

I am loading external HTML content into a variable like this:

$content = file_get_contents('http://localhost');

The page contains a repeating group of blocks like this:

<ul class="items-list">
<li>Title1</li>
<li>Description1</li>
<li>Location1</li>
</ul>
<!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
<a href="#">
<div class="item-price">£10</div>
</a>

<ul class="items-list">
<li>Title2</li>
<li>Description2</li>
<li>Location2</li>
</ul>
<!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
<a href="#">
<div class="item-price">£15</div>
</a>

<ul class="items-list">
<li>Title3</li>
<li>Description3</li>
<li>Location3</li>
</ul>
<!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
<a href="#">
<div class="item-price">£20</div>
</a>

<ul class="items-list">
<li>Title4</li>
<li>Description4</li>
<li>Location4</li>
</ul>
<!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
<a href="#">
<div class="item-price">£25</div>
</a>

I have the following code, which uses DOMXPath to find every items-list UL so that I can loop over the results and echo them.

$dom = new DomDocument();
$dom->loadHTML($content);
$xpath = new DOMXPath($dom); 
$items = $xpath->query("//ul[@class='items-list']"); 

foreach ($items as $node) { 
  echo $node->textContent;
}

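This works perfectly. However, I need help displaying the price for each of these items. The price comes from a div with the class item-price, which appears after the UL but not immediately after it.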

Any ideas how I can do this?

3 answers:

Answer 0 (score: 0)

Use the following-sibling axis. Note that it only matches a price div that is a direct sibling of the ul:

$xpath->query("//ul[@class='items-list']/following-sibling::div[@class='item-price']"); 
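To illustrate the difference between the two axes, here is a minimal sketch. The sample markup is an assumption made up for this example (one price div sits directly beside its ul, the other is wrapped in an anchor as in the question), and `&pound;` is used instead of a literal pound sign so the result does not depend on how `loadHTML` guesses the input encoding.

```php
<?php
// Silence parser warnings about the loosely structured sample markup.
libxml_use_internal_errors(true);

// Hypothetical sample: the first price is a sibling of its ul,
// the second is nested inside an <a> element.
$html = '<div>'
      . '<ul class="items-list"><li>Title1</li></ul>'
      . '<div class="item-price">&pound;10</div>'
      . '<ul class="items-list"><li>Title2</li></ul>'
      . '<a href="#"><div class="item-price">&pound;15</div></a>'
      . '</div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// following-sibling:: only sees nodes that share the ul's parent.
$siblings = $xpath->query(
    "//ul[@class='items-list']/following-sibling::div[@class='item-price']"
);

// following:: sees every later node in document order, nested or not.
$following = $xpath->query(
    "//ul[@class='items-list']/following::div[@class='item-price']"
);

printf(
    "following-sibling matched %d, following matched %d\n",
    $siblings->length,
    $following->length
);
```

With markup like the question's, where every price is wrapped in an anchor, the sibling axis would therefore be expected to match nothing.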

Answer 1 (score: 0)

Combining the original query with the following-sibling axis may be enough.

define('BR','<br />');

$strhtml='<ul class="items-list">
    <li>Title1</li>
    <li>Description1</li>
    <li>Location1</li>
    </ul>
    <!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
    <div class="item-price">£10</div>

    <ul class="items-list">
    <li>Title2</li>
    <li>Description2</li>
    <li>Location2</li>
    </ul>
    <!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
    <div class="item-price">£15</div>

    <ul class="items-list">
    <li>Title3</li>
    <li>Description3</li>
    <li>Location3</li>
    </ul>
    <!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
    <div class="item-price">£20</div>

    <ul class="items-list">
    <li>Title4</li>
    <li>Description4</li>
    <li>Location4</li>
    </ul>
    <!-- OTHER CONTENT HERE BETWEEN THE UL AND THE PRICE DIV -->
    <div class="item-price">£25</div>';


    $dom = new DomDocument();
    $dom->loadHTML( $strhtml );
    $xpath = new DOMXPath( $dom ); 
    $items = $xpath->query("//ul[@class='items-list'] | //ul[@class='items-list']/following-sibling::div[@class='item-price']"); 
    if( $items && $items->length > 0 ){
        foreach ( $items as $node ) { 
            echo $node->textContent . BR;
        }
    }

The above outputs:

Title1 Description1 Location1 
£10
Title2 Description2 Location2 
£15
Title3 Description3 Location3 
£20
Title4 Description4 Location4 
£25

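Given the change to the HTML content, the XPath query needs a small modification, because the div containing the price is no longer a direct sibling (although it might be). The following axis matches any later node in document order: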

$items = $xpath->query("//ul[@class='items-list'] | //ul[@class='items-list']/following::div[@class='item-price']");
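As a rough sketch of how the union query behaves when the prices are nested inside anchors, the example below runs it against a shortened, made-up two-item sample. The union comes back in document order, so each ul is followed by its price; checking `nodeName` is one possible way to tell the two kinds of node apart. The variable names and the `&pound;` entity are assumptions for this illustration.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical two-item sample with each price wrapped in an <a>.
$html = <<<'HTML'
<ul class="items-list"><li>Title1</li><li>Description1</li><li>Location1</li></ul>
<a href="#"><div class="item-price">&pound;10</div></a>
<ul class="items-list"><li>Title2</li><li>Description2</li><li>Location2</li></ul>
<a href="#"><div class="item-price">&pound;15</div></a>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$nodes = $xpath->query(
    "//ul[@class='items-list'] | //ul[@class='items-list']/following::div[@class='item-price']"
);

$lines = [];
foreach ($nodes as $node) {
    if ($node->nodeName === 'ul') {
        // Join the li texts explicitly rather than relying on whitespace
        // inside the ul's textContent.
        $parts = [];
        foreach ($xpath->query('li', $node) as $li) {
            $parts[] = trim($li->textContent);
        }
        $lines[] = implode(' ', $parts);
    } else {
        $lines[] = trim($node->textContent);
    }
}

echo implode("\n", $lines), "\n";
```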

Answer 2 (score: 0)

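foreach ($items as $node) {
  echo $node->textContent;
  // First item-price div after this ul in document order, relative to $node.
  $div = $xpath->query('following::div[@class="item-price"][1]', $node);
  if ($div->length > 0) {
    echo $div->item(0)->nodeValue . "\n\n";
  }
}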
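A self-contained variant of the same idea might look like the sketch below, which collects each list together with its price into an array instead of echoing as it goes. The sample markup, the `$rows` structure, and its `fields`/`price` keys are all assumptions for illustration. `DOMXPath::evaluate` with `string(...)` is used here so that a missing price yields an empty string rather than an empty node list.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical two-item sample modelled on the question's markup.
$html = <<<'HTML'
<ul class="items-list"><li>Title1</li><li>Description1</li><li>Location1</li></ul>
<a href="#"><div class="item-price">&pound;10</div></a>
<ul class="items-list"><li>Title2</li><li>Description2</li><li>Location2</li></ul>
<a href="#"><div class="item-price">&pound;15</div></a>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query("//ul[@class='items-list']") as $ul) {
    // Text of each li, queried relative to the current ul.
    $fields = [];
    foreach ($xpath->query('li', $ul) as $li) {
        $fields[] = trim($li->textContent);
    }

    // String value of the nearest item-price div after this ul.
    $price = $xpath->evaluate(
        "string(following::div[@class='item-price'][1])",
        $ul
    );

    $rows[] = ['fields' => $fields, 'price' => trim($price)];
}

foreach ($rows as $row) {
    echo implode(', ', $row['fields']), ' => ', $row['price'], "\n";
}
```

One caveat with this approach: if an item has no price div of its own, `[1]` picks up the next item's price, so it relies on every ul being followed by exactly one price.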
