Question

Starting from this URL, I want to parse an HTML table, in particular this element:

<td class="tbl_black_n_1" nowrap="">
<a href="popup.asp?tp=2100&amp;lang=en&amp;idm=553759" target="_blank"><img src="http://www.betonews.com//img/i_betfair.gif" width="12" height="10" border="0" alt=""></a>
<a href="popup.asp?tp=2110&amp;lang=en&amp;idm=553759" target="_blank"><img src="http://www.betonews.com//img/i_history.gif" width="12" height="10" border="0" alt=""></a>
</td>

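More than a hundred <tr> rows are built the same way, each containing many <td> cells. I managed to loop over them with XPath and store all the data in a database, except for one thing: the last <td> element. I want the "href" attribute of its first <a>. So, in my example:

"popup.asp?tp=2100&lang=en&idm=553759"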

But when I run my query, the id variable comes back as NULL. Help!

Here is my PHP code, but I can't get any further...


@LarsH I used this PHP code to retrieve what you asked for, and the result is NULL

<?php
$url = 'http://www.betonews.com/table.asp?tp=2001&lang=en&dd=28&dm=7&dy=2014&df=1&dw=3';
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);

curl_close($ch);
$document = new DOMDocument();
$document->loadHTML($response);


$xpath = new DOMXPath($document);
$expression = '(//table[@cellpadding="3"])[1]/tr[position() > 1]';
$rows = $xpath->query($expression);

$results = array();

foreach ($rows as $row) {
  $result = array();
  $td = $row->childNodes;
  $id = $td->item(36)->childNodes->item(1)->attributes->getNamedItem("href")->nodeValue;
  $result["id"] = $id;
  $results[] = $result;
}
var_dump($results);

Using the previous expression you suggested, this is the value of:

$expression = '(//table[@cellpadding="3"])[1]/tr[position() > 1]';
$rows = $xpath->query($expression);
$results = array();
foreach ($rows as $row) {
  $td = $row->childNodes;
  $ok = $td->item(36)->childNodes->item(1)->nodeType;
  echo $ok;
}

$row

Wow! We can see the value! So.. how do I actually retrieve it!? Thanks

EDIT: Yeesss! I finally got it! I used

{
  [0] => array(1) {
    ["ok"] => object(DOMAttr)#3 (21) {
      ["name"] => string(4) "href"
      ["specified"] => bool(true)
      ["value"] => string(36) "popup.asp?tp=2100&lang=en&idm=556296"
      ["ownerElement"] => string(22) "(object value omitted)"
      ["schemaTypeInfo"] => NULL
      ["nodeName"] => string(4) "href"
      ["nodeValue"] => string(36) "popup.asp?tp=2100&lang=en&idm=556296"
      ["nodeType"] => int(2)
      ["parentNode"] => string(22) "(object value omitted)"
      ["childNodes"] => string(22) "(object value omitted)"
      ["firstChild"] => string(22) "(object value omitted)"
      ["lastChild"] => string(22) "(object value omitted)"
      ["previousSibling"] => NULL
      ["nextSibling"] => string(22) "(object value omitted)"
      ["attributes"] => NULL
      ["ownerDocument"] => string(22) "(object value omitted)"
      ["namespaceURI"] => NULL
      ["prefix"] => string(0) ""
      ["localName"] => string(4) "href"
      ["baseURI"] => NULL
      ["textContent"] => string(36) "popup.asp?tp=2100&lang=en&idm=556296"
    }
  }

Thank you, thank you @LarsH!

Answer 1

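The problem may be that the first child node of the <td> is actually a text node consisting only of whitespace. You can test that hypothesis by looking at the nodeType (1 is an element node, 3 is a text node):

$td->item(36)->childNodes->item(1)->nodeType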
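
A minimal sketch of that check, assuming a standalone copy of the cell from the question instead of the live page (the $html string and the $nodeTypes array are only for illustration). It lists every child of the <td> with its nodeType, so whitespace-only text nodes show up next to the <a> elements. The property is spelled nodeType; PHP property names are case-sensitive, so a lowercase nodetype reads as NULL. Whether a given whitespace run is kept as a text node can depend on the libxml2 version, which is why the sketch prints the types rather than assuming them.

```php
<?php
libxml_use_internal_errors(true); // keep parser warnings out of the output

// Illustrative copy of the cell from the question, wrapped in a one-row table.
$html = '<table><tr><td class="tbl_black_n_1" nowrap="">
<a href="popup.asp?tp=2100&amp;lang=en&amp;idm=553759" target="_blank"><img src="http://www.betonews.com//img/i_betfair.gif" width="12" height="10" border="0" alt=""></a>
<a href="popup.asp?tp=2110&amp;lang=en&amp;idm=553759" target="_blank"><img src="http://www.betonews.com//img/i_history.gif" width="12" height="10" border="0" alt=""></a>
</td></tr></table>';

$document = new DOMDocument();
$document->loadHTML($html);
$td = $document->getElementsByTagName('td')->item(0);

// Collect and print the type of every child node of the cell.
// XML_ELEMENT_NODE is 1, XML_TEXT_NODE is 3.
$nodeTypes = array();
foreach ($td->childNodes as $index => $child) {
    $nodeTypes[] = $child->nodeType;
    echo $index, ': ', $child->nodeName, ' (nodeType ', $child->nodeType, ")\n";
}
```

If the index you pass to item() lands on a "#text" entry in this listing, calling ->attributes->getNamedItem() on it cannot give you the href.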

To work around this, you can do more of the navigation in XPath itself, e.g.

(//table[@cellpadding="3"])[1]/tr[position() > 1]/td[19]/a[1]/@href

Then loop over those results:

$expression = '(//table[@cellpadding="3"])[1]/tr[position() > 1]/td[19]/a[1]/@href';
$ids = $xpath->query($expression);

$results = array();

foreach ($ids as $idNode) {
  $result = array();
  $result["id"] = $idNode->nodeValue;
  $results[] = $result;
}
var_dump($results);
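
If you still need the other columns of each row in the same loop, a variation is to keep the row query and evaluate a relative XPath against each row. This is only a sketch under assumptions: the two-column $html table, the "name" key, and the use of td[last()] for "the last <td>" are illustrative and not taken from the real page.

```php
<?php
libxml_use_internal_errors(true); // keep parser warnings out of the output

// Hypothetical, heavily shortened stand-in for the real table.
$html = '<table cellpadding="3">
<tr><th>Match</th><th>Links</th></tr>
<tr><td>A - B</td><td nowrap="">
<a href="popup.asp?tp=2100&amp;lang=en&amp;idm=553759"><img src="i_betfair.gif" alt=""></a>
<a href="popup.asp?tp=2110&amp;lang=en&amp;idm=553759"><img src="i_history.gif" alt=""></a>
</td></tr>
<tr><td>C - D</td><td nowrap="">
<a href="popup.asp?tp=2100&amp;lang=en&amp;idm=556296"><img src="i_betfair.gif" alt=""></a>
</td></tr>
</table>';

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Same row expression as in the question: every row after the header.
$rows = $xpath->query('(//table[@cellpadding="3"])[1]/tr[position() > 1]');

$results = array();
foreach ($rows as $row) {
    $results[] = array(
        // Relative paths are resolved against $row, the second argument.
        'name' => trim($xpath->evaluate('string(td[1])', $row)),
        // string() returns '' when the cell has no <a>, so no NULL checks are needed.
        'id'   => $xpath->evaluate('string(td[last()]/a[1]/@href)', $row),
    );
}
var_dump($results);
```

Because td and a are element steps, whitespace text nodes never shift the positions. Keep the inner expressions relative (no leading //), otherwise they are evaluated from the document root instead of from the current row.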

Retrieving an HTML table with XPath
