How to access the values in td classes using PHP simplexml_load_file

Time: 2013-07-01 02:15:30

Tags: php xml simplexml

XML structure:

<channel>
<title>
</title> 
<item>
<description>
<tbody>
<tr>
<td class="chart_stock_name">NUGT</td>
<td class="chart_stock_price">5.86</td>
<td class="chart_stock_change">+1.02</td>
<td class="chart_stock_prc">+(21.07%)</td>
</tr>
</description> 
</item> 
</channel> 

Code:

$xml = simplexml_load_file($url);
for($i = 0; $i <= 6; $i++) {
$variablename = $xml->channel->item->description;
}


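I need to get each value inside the td classes. The best I have managed is to echo the description, which contains all of the rows.

What is the correct way to get a value inside a td class, e.g. "NUGT"?

Update (full XML):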

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="/res/preview.xsl"?>
<rss version="2.0">
  <channel>
    <title>Price &#37; Gainers</title>
    <link>http://finance.yahoo.com/gainers?e=nq</link>
    <description><![CDATA[The list of gainers by % in the NYSE]]></description>
    <lastBuildDate>Mon, 01 Jul 2013 13:46:05 GMT</lastBuildDate>
    <generator>Feed43 Proxy/1.0 (www.feed43.com)</generator>
    <ttl>360</ttl>

<item>
<guid isPermaLink="false">e5a4b37b270cb65b3039e1d7c0152a88</guid>
<pubDate>Mon, 01 Jul 2013 13:46:05 GMT</pubDate>
<title>Stock Gainers</title>
<link>http://finance.yahoo.com/q?s=REMX</link>
<description><![CDATA[<table class="chart_stock">
<thead>
<tr>
<th>Name</th>
<th>Price</th>
<th>Change</th>
<th>% Chg</th>
</tr>
</thead>
<tbody><tr><td class="chart_stock_name">REMX</td><td class="chart_stock_price">39.00</td><td class="chart_stock_change">+29.51</td><td class="chart_stock_prc">+(310.96%)</td></tr><tr><td class="chart_stock_name">GDXJ</td><td class="chart_stock_price">36.99</td><td class="chart_stock_change">+27.83</td><td class="chart_stock_prc">+(303.82%)</td></tr><tr><td class="chart_stock_name">GEX</td><td class="chart_stock_price">51.45</td><td class="chart_stock_change">+36.23</td><td class="chart_stock_prc">+(238.04%)</td></tr><tr><td class="chart_stock_name">LVB</td><td class="chart_stock_price">35.04</td><td class="chart_stock_change">+4.61</td><td class="chart_stock_prc">+(15.15%)</td></tr><tr><td class="chart_stock_name">NUGT</td><td class="chart_stock_price">6.36</td><td class="chart_stock_change">+0.50</td><td class="chart_stock_prc">+(8.48%)</td></tr><tr><td class="chart_stock_name">FWDI</td><td class="chart_stock_price">23.88</td><td class="chart_stock_change">+0.00</td><td class="chart_stock_prc">+(0.00%)</td></tr><tr><td class="chart_stock_name">LAS</td><td class="chart_stock_price">5.11</td><td class="chart_stock_change">+0.31</td><td class="chart_stock_prc">+(6.46%)</td></tr><tr><td class="chart_stock_name">WWAV-B</td><td class="chart_stock_price">16.10</td><td class="chart_stock_change">+0.90</td><td class="chart_stock_prc">+(5.92%)</td></tr><tr><td class="chart_stock_name">LPLT</td><td class="chart_stock_price">31.78</td><td class="chart_stock_change">+0.00</td><td class="chart_stock_prc">+(0.00%)</td></tr><tr><td class="chart_stock_name">P</td><td class="chart_stock_price">19.43</td><td class="chart_stock_change">+1.03</td><td class="chart_stock_prc">+(5.59%)</td></tr><tr><td class="chart_stock_name">MUX</td><td class="chart_stock_price">1.78</td><td class="chart_stock_change">+0.10</td><td class="chart_stock_prc">+(5.95%)</td></tr><tr><td class="chart_stock_name">NOK</td><td class="chart_stock_price">3.94</td><td class="chart_stock_change">+0.20</td><td 
class="chart_stock_prc">+(5.21%)</td></tr><tr><td class="chart_stock_name">AUQ</td><td class="chart_stock_price">4.59</td><td class="chart_stock_change">+0.22</td><td class="chart_stock_prc">+(5.03%)</td></tr><tr><td class="chart_stock_name">UWTI</td><td class="chart_stock_price">30.81</td><td class="chart_stock_change">+1.44</td><td class="chart_stock_prc">+(4.90%)</td></tr><tr><td class="chart_stock_name">DRD</td><td class="chart_stock_price">5.69</td><td class="chart_stock_change">+0.26</td><td class="chart_stock_prc">+(4.79%)</td></tr><tr><td class="chart_stock_name">AUO</td><td class="chart_stock_price">3.62</td><td class="chart_stock_change">+0.16</td><td class="chart_stock_prc">+(4.62%)</td></tr><tr><td class="chart_stock_name">SLVP</td><td class="chart_stock_price">11.88</td><td class="chart_stock_change">+0.52</td><td class="chart_stock_prc">+(4.58%)</td></tr><tr><td class="chart_stock_name">IAG</td><td class="chart_stock_price">4.38</td><td class="chart_stock_change">+0.18</td><td class="chart_stock_prc">+(4.16%)</td></tr><tr><td class="chart_stock_name">EGO</td><td class="chart_stock_price">6.42</td><td class="chart_stock_change">+0.24</td><td class="chart_stock_prc">+(3.88%)</td></tr><tr><td class="chart_stock_name">GNK</td><td class="chart_stock_price">1.63</td><td class="chart_stock_change">+0.00</td><td class="chart_stock_prc">+(0.00%)</td></tr><tr><td class="chart_stock_name">TKC</td><td class="chart_stock_price">14.98</td><td class="chart_stock_change">+0.61</td><td class="chart_stock_prc">+(4.24%)</td></tr><tr><td class="chart_stock_name">TAHO</td><td class="chart_stock_price">14.73</td><td class="chart_stock_change">+0.58</td><td class="chart_stock_prc">+(4.10%)</td></tr><tr><td class="chart_stock_name">BALT</td><td class="chart_stock_price">3.86</td><td class="chart_stock_change">+0.15</td><td class="chart_stock_prc">+(4.04%)</td></tr><tr><td class="chart_stock_name">TGEM</td><td class="chart_stock_price">19.01</td><td 
class="chart_stock_change">+0.72</td><td class="chart_stock_prc">+(3.94%)</td></tr><tr><td class="chart_stock_name">MMD</td><td class="chart_stock_price">18.69</td><td class="chart_stock_change">+0.70</td><td class="chart_stock_prc">+(3.89%)</td></tr></tbody>
</table>]]></description>
</item>


  </channel>
</rss>

1 Answer:

Answer 0 (score: 2)

The first thing I would note is that your XML is not quite valid, since <tbody> is missing its closing tag.
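
To see how such a problem surfaces in PHP, here is a minimal sketch. It assumes a hypothetical inline string ($broken) that mirrors the question's snippet rather than the real feed, and it collects the parser errors with libxml_use_internal_errors() instead of letting them appear as warnings.

```php
<?php
// Hypothetical sample mirroring the question's snippet: <tbody> is never closed.
$broken = '<channel><item><description><tbody><tr>'
        . '<td class="chart_stock_name">NUGT</td>'
        . '</tr></description></item></channel>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_string($broken);

$messages = array();
if ($xml === false) {
    // Parsing failed: gather the parser's messages for inspection.
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();
}

print_r($messages);
```

If $xml is false, nothing after the load can work, so checking the return value before querying is worth the extra lines.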

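A simple approach is an XPath query that returns the <td> nodes whose class contains chart_stock. From there you can loop over them and retrieve each node value, building an array whose keys are the chart_stock_* classes and whose values are the corresponding node values.

Here is what happens in the XPath query.

  • //td selects all <td> nodes...
  • contains(@class, "chart_stock") ...that have "chart_stock" in their class attribute.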

// Load your file
$xml = simplexml_load_file($url);
// Get all the <td> nodes via xpath, 
// only those containing chart_stock in the class
$tds = $xml->xpath('//td[contains(@class, "chart_stock")]');

// An array to hold your values...
$output = array();

// Loop over them and build an array of key => value pairs
// based on the class attribute
foreach ($tds as $td) {
  $attr = $td->attributes();
  // Cast attribute and node as strings and assign to your array
  $class = (string)$attr['class'];
  $output[$class] = (string)$td;
}

print_r($output);
Array
(
    [chart_stock_name] => NUGT
    [chart_stock_price] => 5.86
    [chart_stock_change] => +1.02
    [chart_stock_prc] => +(21.07%)
)
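
Because the array above is keyed by class name, a second <tr> would overwrite the values of the first. The following sketch groups the cells per row instead. It assumes a hypothetical two-row sample ($sample) in which the <td> cells are real XML elements, as in the question's first snippet, and it also shows an exact class match for fetching only the names.

```php
<?php
// Hypothetical two-row sample; assumes the <td> cells are real XML elements.
$sample = '<channel><item><description><tbody>'
        . '<tr><td class="chart_stock_name">NUGT</td><td class="chart_stock_price">5.86</td></tr>'
        . '<tr><td class="chart_stock_name">REMX</td><td class="chart_stock_price">39.00</td></tr>'
        . '</tbody></description></item></channel>';

$xml = simplexml_load_string($sample);

$rows = array();
foreach ($xml->xpath('//tr') as $tr) {
    $row = array();
    // Relative XPath: only the <td> children of the current <tr>
    foreach ($tr->xpath('td[contains(@class, "chart_stock")]') as $td) {
        $row[(string) $td['class']] = (string) $td;
    }
    $rows[] = $row;
}

// Names only, using an exact match on the class attribute
$names = array_map('strval', $xml->xpath('//td[@class="chart_stock_name"]'));

print_r($rows);
print_r($names);
```

Each entry of $rows is one table row keyed by class, and $names is a flat list of the ticker symbols.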

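After seeing the real XML, it looks quite different from what your original example implied. The <description> nodes each contain a CDATA block of HTML. SimpleXML is not well suited to HTML, and DOMDocument is more robust there. With DOMDocument, retrieve the <description> nodes, then load their contents as HTML. Using DOM API calls such as getElementsByTagName, loop over the <tr> and <td> elements and load the rows into $output, appending a new sub-array for each <tr>.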
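
To illustrate why the XPath query finds nothing on the real feed, here is a small sketch. It assumes a hypothetical, trimmed-down copy of the feed ($feed) with a single row; the point is only that markup inside CDATA is plain text to the XML parser.

```php
<?php
// Hypothetical, trimmed-down version of the feed in the question.
$feed = '<?xml version="1.0" encoding="utf-8"?>'
      . '<rss version="2.0"><channel><item><description><![CDATA['
      . '<table class="chart_stock"><tbody><tr>'
      . '<td class="chart_stock_name">REMX</td>'
      . '</tr></tbody></table>'
      . ']]></description></item></channel></rss>';

$xml = simplexml_load_string($feed);

// The <td> markup sits inside CDATA, so it is not part of the XML tree.
$tds   = $xml->xpath('//td');
$count = is_array($tds) ? count($tds) : 0;
echo $count, "\n";

// Casting the node to a string returns the raw HTML text.
// $xml is already the <rss> root, so the path starts at <channel>.
$html = (string) $xml->channel->item->description;
echo $html, "\n";
```

The string in $html is what needs to be handed to an HTML parser.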

// An array to hold your values...
$output = array();

// DOMDocument for the outer XML
// ($xmltext holds the feed as a string, e.g. from file_get_contents($url))
$maindom = new DOMDocument();
$maindom->loadXML($xmltext);

// Loop over description nodes
$desc = $maindom->getElementsByTagName('description');

foreach ($desc as $d) {
    // get the cdata block
    $cdata = $d->nodeValue; 
    // and load it as HTML into DOMDocument
    $dom = new DOMDocument();
    $dom->loadHTML($cdata);
    // Get its descendant <tr>
    $trs = $dom->getElementsByTagName("tr");

    // Loop over each <tr> and get its child <td> nodes to retrieve your values
    foreach ($trs as $tr) {
        // New output sub-array per tr
        $row = array();
        $tds = $tr->getElementsByTagName('td');
        foreach ($tds as $td) {
            // Load each <td> onto the current row array by class
            $class = $td->getAttribute('class');
            $row[$class] = $td->nodeValue;
        }
        // Append to $output
        $output[] = $row;
    }
}
echo '<pre>';
print_r($output);
echo '</pre>';
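
As a possible variation on the loop above, the sketch below uses DOMXPath with a context node to pull specific cells from each row, and silences the warnings loadHTML() can raise on imperfect markup. The $cdata fragment and the $prices array are hypothetical stand-ins for one <description> block, not part of the original answer.

```php
<?php
// Hypothetical HTML fragment standing in for one <description> CDATA block.
$cdata = '<table class="chart_stock"><tbody>'
       . '<tr><td class="chart_stock_name">REMX</td><td class="chart_stock_price">39.00</td></tr>'
       . '<tr><td class="chart_stock_name">GDXJ</td><td class="chart_stock_price">36.99</td></tr>'
       . '</tbody></table>';

// Keep parser complaints about imperfect HTML out of the output.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($cdata);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath  = new DOMXPath($dom);
$prices = array();
foreach ($xpath->query('//tr') as $tr) {
    // The second argument is the context node, so these paths are relative to this <tr>
    $name  = $xpath->query('td[@class="chart_stock_name"]', $tr)->item(0);
    $price = $xpath->query('td[@class="chart_stock_price"]', $tr)->item(0);
    // Header rows contain <th> only, so both lookups return null and are skipped
    if ($name !== null && $price !== null) {
        $prices[$name->nodeValue] = (float) $price->nodeValue;
    }
}

print_r($prices);
```

Keying by ticker symbol makes lookups such as $prices['REMX'] direct, at the cost of dropping the other columns.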
