foreach over XML nodes: return only a selected element

Date: 2014-09-30 22:23:11

Tags: php html xml simplexml domdocument

How can I get only a selected value from an XML CDATA section?

So far, with help from Stack Overflow, I can get all the <b> tags in the string:

$result = simplexml_load_file($url, 'SimpleXMLElement', LIBXML_NOCDATA);

foreach ($result->channel->item as $item) {
    $desc = $item->description;
    // The DOMDocument constructor takes a version and an encoding, not the markup itself.
    $dom = new DOMDocument();
    $dom->loadHTML($desc);
    $bold_tags = $dom->getElementsByTagName('b');
    foreach ($bold_tags as $b) {
        echo $b->nodeValue . "<br>";
    }
}

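But this echoes everything inside the <b> tags, and I only want the price. I tried the ->item(x) approach I found on Stack Overflow, but it had no effect, whether I wrote echo $b->nodeValue->item(2) . "<br>"; or echo $b->item(2)->nodeValue . "<br>";. So where should I put it, or what should I use instead, to get only the price from the <b> elements? The price is always in the same place.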

Here is the CDATA from the feed:

<a href="//www.ss.lv/msg/lv/real-estate/flats/riga/purvciems/deblb.html">
    <img align="right" border="0" src="//i.ss.lv/images/2014-10-01/349288/VHkAHkBlRlo=/1.t.jpg" width="160" height="120" alt="">
</a> District: <b><b>Purvciems</b></b><br />
Street: <b><b>Dudajeva g. 12</b></b><br />
Rooms: <b><b>2</b></b><br />
m2: <b><b>50</b></b><br />
Type: <b><b>LT proj.</b></b><br />
: <b><b>3</b> €</b><br />
Price: <b><b>150</b> €/mēn.</b><br />
<br />
<b><a href="//www.ss.lv/msg/lv/real-estate/flats/riga/purvciems/deblb.html">Apskatīt sludinājumu</a></b><br />
<br />
]]>

1 answer:

Answer 0 (score: 1)

You can try this approach to parse the prices:

$url = "http://www.ss.lv/lv/real-estate/flats/riga/hand_over/rss/";
$result = simplexml_load_file($url, 'SimpleXMLElement', LIBXML_NOCDATA);

$data = array();
foreach($result->channel->item as $item) {
    $temp = array();

    $title = (string) trim($item->title);
    $desc = $item->description;

    $temp['title'] = $title;

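    $dom = new DOMDocument('1.0', 'utf-8');
    // Convert non-ASCII characters to HTML entities so loadHTML() does not mangle UTF-8 text
    // (the 'HTML-ENTITIES' target is deprecated as of PHP 8.2).
    $desc = mb_convert_encoding($desc, 'HTML-ENTITIES', "UTF-8");
    $dom->loadHTML($desc);
    $xpath = new DOMXpath($dom);
    // Find the text node containing "Cena" (Latvian for "price").
    $price_tag = $xpath->query('//text()[contains(., "Cena")]');
    // The value sits in the <b> element right after that text node; guard against items without a price.
    $label = $price_tag->item(0);
    $price = ($label !== null && $label->nextSibling !== null) ? $label->nextSibling->nodeValue : null;
    $temp['price'] = $price;
    $data[] = $temp;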
}

echo '<pre>';
print_r($data);

OK, here is the explanation:

The goal is to extract the prices, which sit inside the CDATA of each <description> tag.

Each <item> node contains a description like this:

<a href="//www.ss.lv/msg/lv/real-estate/flats/riga/centre/colfo.html">
    <img align=right border=0 src="//i.ss.lv/images/2014-08-25/346391/VHkPH0FiQVo=/1.t.jpg" width="160" height="120" alt="">
</a>
Rajons: <b>centrs</b>
<br/>Iela: <b>Rūpniecības 7</b><br/>Ist.: <b>4</b>
<br/>m2: <b>145</b><br/>Sērija: <b>Renov.</b><br/>: <b>10.34 €</b>
<br/>Cena: <b>1,500 €/mēn.</b><br/>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ // this one
<br/><b><a href="//www.ss.lv/msg/lv/real-estate/flats/riga/centre/colfo.html">Apskatīt sludinājumu</a></b><br/><br/>

So the plan is to search for the price label (Cena) with XPath. According to the markup, the label is a plain text node (not an element or tag).

So we target the text node that contains "Cena":

//text()[contains(., "Cena")]
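
To make the expression concrete, here is a minimal sketch that runs it against a shortened, hypothetical fragment modeled on the sample description (not the live feed). The `<?xml encoding="UTF-8">` prefix is a commonly used workaround, assumed here, to make loadHTML() read the string as UTF-8.

```php
<?php
// Hypothetical fragment modeled on the sample <description>; not fetched from the feed.
$html = 'Rajons: <b>centrs</b><br/>m2: <b>145</b><br/>Cena: <b>1,500 €/mēn.</b><br/>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
// Assumption: the XML encoding prefix makes loadHTML() treat the input as UTF-8.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);

$xpath = new DOMXPath($dom);
$matches = $xpath->query('//text()[contains(., "Cena")]');

foreach ($matches as $node) {
    // Each match is a text node holding the label, not the <b> element with the value.
    echo get_class($node), ': "', trim($node->nodeValue), '"', PHP_EOL;
}
```

The query returns a DOMNodeList of text nodes, which is why the value itself still has to be reached from the matched label.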

Each Cena/Price text node is followed by a sibling <b> tag that holds the value, so we locate each Cena/Price node and move to its next sibling <b> tag:
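
As a variation on the same idea, the sibling step can probably be folded into the XPath expression itself with the following-sibling axis. This is a sketch on a hypothetical fragment, not the answer's original code. The variable names and the numeric conversion are illustrative, and the conversion assumes "," is a thousands separator in this feed.

```php
<?php
// Hypothetical fragment modeled on the sample <description>; not fetched from the feed.
$html = 'm2: <b>145</b><br/>: <b>10.34 €</b><br/>Cena: <b>1,500 €/mēn.</b><br/>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
// Assumption: the XML encoding prefix makes loadHTML() treat the input as UTF-8.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
$xpath = new DOMXPath($dom);

// following-sibling::b[1] selects the first <b> element after the "Cena" text node.
$bold = $xpath->query('//text()[contains(., "Cena")]/following-sibling::b[1]')->item(0);
$price = $bold !== null ? trim($bold->textContent) : null;

// Optional: pull out the numeric part, assuming "," is a thousands separator.
$amount = null;
if ($price !== null && preg_match('/[\d.,]+/', $price, $m)) {
    $amount = (float) str_replace(',', '', $m[0]);
}

var_dump($price, $amount);
```

Unlike nextSibling, this form only ever lands on a <b> element, and item(0) returns null when an item has no price label, so the null check avoids a fatal error.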

item(0)->nextSibling->nodeValue
Cena/Price -> nextSibling (which is b tag) -> its value
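
On the question's item(x) attempts: item() is a method of DOMNodeList (what getElementsByTagName() and query() return), not of a single element or its nodeValue string, which would explain why calling it on $b did not work. If the price really is always at the same position, indexing the list is a possible alternative. The sketch below uses a hypothetical three-field fragment, so the index 2 is an assumption. In the real feed the index depends on how many <b> tags precede the price (and nested <b> tags each count separately).

```php
<?php
// Hypothetical fragment with three <b> tags; the real feed has more fields.
$html = 'Rajons: <b>centrs</b><br/>m2: <b>145</b><br/>Cena: <b>1,500 €/mēn.</b><br/>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
// Assumption: the XML encoding prefix makes loadHTML() treat the input as UTF-8.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);

$bold_tags = $dom->getElementsByTagName('b'); // a DOMNodeList
$price_index = 2;                             // hypothetical position of the price <b>

// item() is called on the list; it returns null when the index is out of range.
$price_node = $bold_tags->item($price_index);
$price = $price_node !== null ? trim($price_node->nodeValue) : null;

echo $price, PHP_EOL;
```

Searching for the "Cena" label as in the answer is less brittle than a fixed index, since it keeps working if the feed adds or drops a field.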