How do I select text from an HTML table using a PHP DOM query?

Time: 2016-04-19 05:37:45

Tags: php html dom web-scraping

How can I get the text from HTML table cells using a PHP DOM query?

The HTML table is:

<table>
  <tr>
    <th>Job Location:</th>
    <td><a href="/#">Kabul</a>
    </td>
  </tr>
  <tr>
    <th>Nationality:</th>
    <td>Afghan</td>
  </tr>
  <tr>
    <th>Category:</th>
    <td>Program</td>
  </tr>
</table>

I have the following query, but it does not work:

$xmlPageDom = new DomDocument();
@$xmlPageDom->loadHTML($html);
$xmlPageXPath = new DOMXPath($xmlPageDom);
$value = $xmlPageXPath->query('//table td /text()');

1 answer:

Answer 0 (score: 2)

See the related question "get a complete table with php domdocument and print it".

The answer there goes like this:

$html = "<table ID='myid'><tr><td>1</td><td>2</td></tr><tr><td>4</td><td>5</td></tr><tr><td>7</td><td>8</td></tr></table>";

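// Parse the HTML string into a DOM tree.
$xml = new DOMDocument();
$xml->validateOnParse = true;
$xml->loadHTML($html);

// Locate the table element by its id attribute.
$xpath = new DOMXPath($xml);
$table = $xpath->query("//*[@id='myid']")->item(0);

// Walk every row, then every cell in that row.
$rows = $table->getElementsByTagName("tr");

foreach ($rows as $row) {
    $cells = $row->getElementsByTagName('td');
    foreach ($cells as $cell) {
        // nodeValue is the text content of the cell.
        print $cell->nodeValue;
    }
}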

Edit: the table in the question has no id attribute, so select it with this line instead:

$table = $xpath->query("//table")->item(0);
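
As for why the query in the question fails: `//table td /text()` is not a valid XPath expression, because the steps `table` and `td` need a `/` or `//` between them. Also, `td/text()` only matches text nodes that are direct children of a cell, so it would miss "Kabul", which sits inside an `<a>` element. The following is a minimal sketch, not the only way to do it, that runs against the table from the question. It reads `textContent` for each cell and also pairs each `<th>` label with its `<td>` value. The variable names `$values` and `$byLabel` are illustrative.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<table>
  <tr>
    <th>Job Location:</th>
    <td><a href="/#">Kabul</a>
    </td>
  </tr>
  <tr>
    <th>Nationality:</th>
    <td>Afghan</td>
  </tr>
  <tr>
    <th>Category:</th>
    <td>Program</td>
  </tr>
</table>
HTML;

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// 1) Text of every cell. textContent includes text nested in child
//    elements such as <a>; trim() drops the surrounding whitespace.
$values = [];
foreach ($xpath->query('//table//td') as $cell) {
    $values[] = trim($cell->textContent);
}

// 2) Map each row's <th> label to its <td> value.
$byLabel = [];
foreach ($xpath->query('//table//tr[th and td]') as $row) {
    // string(...) returns the text of the first matching node,
    // evaluated relative to the current row.
    $label = trim($xpath->evaluate('string(th)', $row));
    $value = trim($xpath->evaluate('string(td)', $row));
    $byLabel[rtrim($label, ':')] = $value;
}

print_r($values);
print_r($byLabel);
```

With the label map, a single field can be read as `$byLabel['Job Location']`. If the page contains more than one table, the `//table` step would need a narrower selector, which depends on markup not shown in the question.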