Parsing a table with PHP XPath

Date: 2017-02-07 03:40:43

Tags: php xpath

First, here is my table HTML:

<table class="xyz">
<caption>Outcomes</caption>
<thead>
 <tr class="head">
  <th title="a" class="left" nowrap="nowrap">A1</th>
  <th title="a" class="left" nowrap="nowrap">A2</th>
  <th title="result" class="left" nowrap="nowrap">Result</th>
  <th title="margin" class="left" nowrap="nowrap">Margin</th>
  <th title="area" class="left" nowrap="nowrap">Area</th>
  <th title="date" nowrap="nowrap">Date</th>
  <th title="link" nowrap="nowrap">Link</th>
 </tr>
</thead>
<tbody>
 <tr class="data1">
  <td class="left" nowrap="nowrap">56546</td>
  <td class="left" nowrap="nowrap">75666</td>
  <td class="left" nowrap="nowrap">Lower</td>
  <td class="left" nowrap="nowrap">High</td>
  <td class="left">Area 3</td>
  <td nowrap="nowrap">Jan 2 2016</td>
  <td nowrap="nowrap">http://localhost/545436</td>
 </tr>
 <tr class="data1">
  <td class="left" nowrap="nowrap">55546</td>
  <td class="left" nowrap="nowrap">71666</td>
  <td class="left" nowrap="nowrap">Lower</td>
  <td class="left" nowrap="nowrap">High</td>
  <td class="left">Area 4</td>
  <td nowrap="nowrap">Jan 3 2016</td>
  <td nowrap="nowrap">http://localhost/545437</td>
 </tr>
 ...

More <tr> rows follow after this.

I am using this PHP code:

    $html = file_get_contents('http://localhost/outcomes');

    $document = new DOMDocument();
    $document->loadHTML($html);

    $xpath = new DOMXPath($document);
    $xpath->registerNamespace('', 'http://www.w3.org/1999/xhtml');
    $elements = $xpath->query("//table[@class='xyz']");

Now that I have the table as the first element in $elements, how do I get the value of each <td>?

Ideally, I would like to end up with arrays like this:

array(56546, 75666, 'Lower', 'High', 'Area 3', 'Jan 2 2016', 'http://localhost/545436'),
array(55546, 71666, 'Lower', 'High', 'Area 4', 'Jan 3 2016', 'http://localhost/545437'),
...

But I am not sure how to drill down into the table markup.

Thanks for any suggestions.

1 Answer:

Answer 0: (score: 2)

First, get all the table rows in the <tbody>:

    $rows = $xpath->query('//table[@class="xyz"]/tbody/tr');

Then you can iterate over that collection and query the <td> cells of each row:

    foreach ($rows as $row) {
        // Collect the <td> elements of this row.
        $cells = $row->getElementsByTagName('td');
        // Alternatively: $cells = $xpath->query('td', $row);

        $cellData = [];
        foreach ($cells as $cell) {
            // nodeValue holds the text content of the cell.
            $cellData[] = $cell->nodeValue;
        }
        var_dump($cellData);
    }
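
To tie this back to the question, here is a minimal, self-contained sketch that starts from the table node already selected (the first item in $elements) and builds the nested array the question asks for. The inline $html string is an abbreviated stand-in for the page at http://localhost/outcomes, and the names $table, $result, $rowData and $text are only illustrative. It assumes, as far as I know, that loadHTML() leaves elements without a namespace, so the registerNamespace() call is omitted. Converting all-digit cells to integers is an assumption based on the desired output shown in the question.

```php
<?php
// Abbreviated sample markup standing in for http://localhost/outcomes.
$html = <<<'HTML'
<table class="xyz">
<caption>Outcomes</caption>
<thead>
 <tr class="head"><th>A1</th><th>A2</th><th>Result</th><th>Margin</th><th>Area</th><th>Date</th><th>Link</th></tr>
</thead>
<tbody>
 <tr class="data1">
  <td>56546</td><td>75666</td><td>Lower</td><td>High</td>
  <td>Area 3</td><td>Jan 2 2016</td><td>http://localhost/545436</td>
 </tr>
 <tr class="data1">
  <td>55546</td><td>71666</td><td>Lower</td><td>High</td>
  <td>Area 4</td><td>Jan 3 2016</td><td>http://localhost/545437</td>
 </tr>
</tbody>
</table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);
$elements = $xpath->query("//table[@class='xyz']");
$table = $elements->item(0); // null if no matching table was found

$result = [];
if ($table !== null) {
    // The second argument makes the query relative to the given node.
    foreach ($xpath->query('./tbody/tr', $table) as $row) {
        $rowData = [];
        foreach ($xpath->query('./td', $row) as $cell) {
            $text = trim($cell->textContent);
            // Assumption: cells made up only of digits should become integers.
            $rowData[] = preg_match('/^\d+$/', $text) === 1 ? (int) $text : $text;
        }
        $result[] = $rowData;
    }
}

print_r($result);
```

Passing a context node as the second argument to query() keeps the row and cell lookups scoped to the selected table, which matters if the page contains more than one table.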