I need to parse the following HTML:
<ul class="zg_hrsr">
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#15</span>
<span class="zg_hrsr_ladder">
in
<a href="http://www.amazon.com/gp/bestsellers/digital-text/ref=pd_zg_hrsr_kstore_1_1">Kindle Store</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/154606011/ref=pd_zg_hrsr_kstore_1_2">Kindle eBooks</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/157325011/ref=pd_zg_hrsr_kstore_1_3">Nonfiction</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/292975011/ref=pd_zg_hrsr_kstore_1_4">Lifestyle & Home</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156699011/ref=pd_zg_hrsr_kstore_1_5">Home & Garden</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156828011/ref=pd_zg_hrsr_kstore_1_6">Gardening & Horticulture</a>
>
<b>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156847011/ref=pd_zg_hrsr_kstore_1_7_last">Greenhouses</a>
</b>
</span>
</li>
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#26</span>
<span class="zg_hrsr_ladder">
in
<a href="http://www.amazon.com/gp/bestsellers/digital-text/ref=pd_zg_hrsr_kstore_2_1">Kindle Store</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/154606011/ref=pd_zg_hrsr_kstore_2_2">Kindle eBooks</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/157325011/ref=pd_zg_hrsr_kstore_2_3">Nonfiction</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/292975011/ref=pd_zg_hrsr_kstore_2_4">Lifestyle & Home</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156699011/ref=pd_zg_hrsr_kstore_2_5">Home & Garden</a>
>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156828011/ref=pd_zg_hrsr_kstore_2_6">Gardening & Horticulture</a>
>
<b>
<a href="http://www.amazon.com/gp/bestsellers/digital-text/156849011/ref=pd_zg_hrsr_kstore_2_7_last">House Plants</a>
</b>
</span>
</li>
</ul>
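The output I want is:
Sellers Rank: #266,715 Paid in Kindle Store (See Top 100 Paid in Kindle Store)
#15 in Kindle Store > Kindle eBooks > Nonfiction > Lifestyle & Home > Home & Garden > Gardening & Horticulture > Greenhouses
#26 in Kindle Store > Kindle eBooks > Nonfiction > Lifestyle & Home > Home & Garden > Gardening & Horticulture > House Plants
How can I do this? All I know is that I should read the 'nodeValue' of each 'a' tag, but I am confused about how to get the values into the format I need. I think I should use an array, but I could not get it working because I am a beginner.
Guidance and help, please. I only need the XPath and the structure of the array (if it can be done with an array), or an alternative to an array.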
Answer 0 (score: 0)
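// Create a DOMXPath from your DOMDocument ($dom must already hold the parsed HTML):
$xpath = new DOMXPath($dom);
// Loop over each list item: the rank and the links sit in sibling spans inside it.
foreach ($xpath->query("//li[@class='zg_hrsr_item']") as $item) {
    // Keep the node in $item and store the rank text separately, so the node stays usable as a context.
    $rank = trim($xpath->evaluate("string(span[@class='zg_hrsr_rank'])", $item));
    $trail = array();
    // Relative path (no leading //), so only the links inside this item are matched.
    foreach ($xpath->query("span[@class='zg_hrsr_ladder']//a", $item) as $step) {
        $trail[] = trim($step->textContent);
    }
    echo $rank . ' in ' . implode(' > ', $trail) . "\n";
}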
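To try this end to end, here is a minimal self-contained sketch that also collects the results into an array, since the question asks for an array structure. The `$html` sample is a shortened, hypothetical copy of the markup from the question, and `parseRanks` and the `rank` / `trail` keys are names chosen here for illustration only. It assumes the PHP DOM extension is available.

```php
<?php
// Shortened, hypothetical copy of the markup from the question.
$html = <<<'HTML'
<ul class="zg_hrsr">
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#15</span>
<span class="zg_hrsr_ladder">in
<a href="#">Kindle Store</a> &gt;
<b><a href="#">Greenhouses</a></b>
</span>
</li>
<li class="zg_hrsr_item">
<span class="zg_hrsr_rank">#26</span>
<span class="zg_hrsr_ladder">in
<a href="#">Kindle Store</a> &gt;
<b><a href="#">House Plants</a></b>
</span>
</li>
</ul>
HTML;

// Illustrative helper: returns one array entry per ranking item.
function parseRanks(string $html): array
{
    $dom = new DOMDocument();
    // Scraped pages are rarely valid HTML, so buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = array();
    foreach ($xpath->query("//li[@class='zg_hrsr_item']") as $item) {
        // string(...) returns the text of the first matching rank span.
        $rank = trim($xpath->evaluate("string(span[@class='zg_hrsr_rank'])", $item));
        $trail = array();
        // Relative path: only links inside this item's ladder span.
        foreach ($xpath->query("span[@class='zg_hrsr_ladder']//a", $item) as $step) {
            $trail[] = trim($step->textContent);
        }
        $rows[] = array('rank' => $rank, 'trail' => $trail);
    }
    return $rows;
}

$rows = parseRanks($html);
foreach ($rows as $row) {
    echo $row['rank'] . ' in ' . implode(' > ', $row['trail']) . "\n";
}
```

Each entry of `$rows` holds the rank string and the ordered list of category names, so the same data can be printed, joined with a different separator, or stored as needed.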