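Looping through inner article elements with XPath

Time: 2017-05-03 15:24:03

Tags: php xpath

I can't loop through the HTML shown below with XPath. The goal is to loop over the outer article elements and then over the article elements nested inside each one. I think the problem is the inner query I run against the nested article elements.

My HTML looks like this: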

 <div id="content">
   <article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome</a></h2>
           </div>
        </article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome</a></h2>
           </div>
        </article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome</a></h2>
           </div>
        </article>
   </article>
   <article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome2</a></h2>
           </div>
        </article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome2</a></h2>
           </div>
        </article>
        <article>
           <div>
             <h2><a href="hrefvalue">Something Awesome2</a></h2>
           </div>
        </article>
   </article>
</div>

My XPath code is as follows:

$articlesxpath = $xpath->query('//*[@id="content"]/article');
foreach($articlesxpath as $item){
  $items = $item->query('./article');
  foreach($items as $ix){
   var_dump($ix);
  }

}

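As you can see, I am trying to iterate over the articles and then over the inner article elements inside each of them. The goal is to get information out of the inner article elements.

I'm not sure what is wrong with my code.

2 answers:

Answer 0 (score: 0)

To narrow an XPath query to a given node, pass that node as the second argument to query. Note that query is a method of the DOMXPath object, not of the element, so it has to be called on $xpath rather than on $item.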

$articlesxpath = $xpath->query('//*[@id="content"]/article');
foreach($articlesxpath as $item){
    // search in $item node
    $items = $xpath->query('article', $item);
    foreach($items as $ix) {
        var_dump($ix->nodeValue);
    }
}
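The context-node approach above can be sketched as a self-contained script. This is only an illustration under stated assumptions: the markup is a shortened stand-in for the HTML in the question, the variable name `$titlesByOuter` is made up for this example, and `libxml_use_internal_errors(true)` is added because libxml may emit warnings for HTML5 tags such as `<article>`.

```php
<?php
// Shortened stand-in for the question's HTML (assumption, not the full page).
$html = '<div id="content">'
      . '<article>'
      . '<article><div><h2><a href="hrefvalue">Something Awesome</a></h2></div></article>'
      . '<article><div><h2><a href="hrefvalue">Something Awesome</a></h2></div></article>'
      . '</article>'
      . '<article>'
      . '<article><div><h2><a href="hrefvalue">Something Awesome2</a></h2></div></article>'
      . '</article>'
      . '</div>';

// libxml may warn about HTML5 tags such as <article>; collect the warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hypothetical result structure: one list of inner titles per outer article.
$titlesByOuter = [];
foreach ($xpath->query('//*[@id="content"]/article') as $outer) {
    $group = [];
    // query() belongs to DOMXPath; the second argument is the context node,
    // so the relative path "article" only matches children of $outer.
    foreach ($xpath->query('article', $outer) as $inner) {
        $group[] = trim($inner->textContent);
    }
    $titlesByOuter[] = $group;
}

print_r($titlesByOuter);
```

Keeping the nested loops (instead of the single flattened query) is useful when the results need to stay grouped by their outer article.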

Or, more simply, select the inner articles directly with a single query:

$articlesxpath = $xpath->query('//*[@id="content"]/article/article');
foreach($articlesxpath as $item){
    var_dump($item->nodeValue);
}

To get the href of the a element (inside each inner article), write a query that selects the a elements themselves:

$articlesxpath = $xpath->query('//*[@id="content"]/article/article/div/h2/a');
foreach($articlesxpath as $item){
    var_dump($item->getAttribute('href'));
}

See the manual for DOMXPath::query, and make sure you are familiar with it.

Answer 1 (score: 0)

If your HTML template is fixed, you can query it like this: //div[@id="content"]/article/article/div/h2/a

XPath query:

<?php
$string = '<html><body><div id="content">
    <article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome</a></h2> </div> </article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome</a></h2> </div> </article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome</a></h2> </div> </article>
    </article>
    <article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome2</a></h2> </div> </article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome2</a></h2> </div> </article>
        <article> <div> <h2><a href="hrefvalue">Something Awesome2</a></h2> </div> </article>
    </article>
</div></body></html>';

// libxml may emit warnings for HTML5 tags such as <article>; collect them internally
libxml_use_internal_errors(true);
$obj = new DOMDocument();
$obj->loadHTML($string);
libxml_clear_errors();

$xpath = new DOMXPath($obj);
// Select every link inside an inner article
$articlesxpath = $xpath->query('//div[@id="content"]/article/article/div/h2/a');
foreach ($articlesxpath as $item) {
    print_r($item->getAttribute("href"));
    echo PHP_EOL;
}

Output:

hrefvalue
hrefvalue
hrefvalue
hrefvalue
hrefvalue
hrefvalue
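The fixed path above relies on every link sitting at article/article/div/h2/a. If the template were not fixed, one possible variation is to combine the context node with the descendant axis (`.//`). The sketch below is not from either answer; the markup and the `first`/`second` href values are hypothetical, chosen so that the links sit at different depths.

```php
<?php
// Hypothetical markup: the link is nested at a different depth in each inner article.
$html = '<div id="content"><article>'
      . '<article><div><h2><a href="first">One</a></h2></div></article>'
      . '<article><div><div><p><a href="second">Two</a></p></div></div></article>'
      . '</article></div>';

// libxml may warn about HTML5 tags such as <article>; collect the warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$hrefs = [];
foreach ($xpath->query('//div[@id="content"]/article/article') as $inner) {
    // ".//a[@href]" matches any descendant link of the context node,
    // regardless of how deeply it is nested.
    foreach ($xpath->query('.//a[@href]', $inner) as $link) {
        $hrefs[] = $link->getAttribute('href');
    }
}

print_r($hrefs);
```

The leading dot matters here: a query starting with `//` would search the whole document again instead of only the current inner article.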
