Web scraping with PHP

Date: 2020-02-29 18:52:51

Tags: php xpath web-scraping domxpath

I am trying to scrape this website: www.odds-scanner.com, but my code produces no rows in the output. How can I fix this?

<?php
$url = 'http://www.odds-scanner.com/';
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->validateOnParse = false;
$dom->recover = true;
$dom->strictErrorChecking = false;
$dom->loadHTMLFile($url);
libxml_clear_errors();

$xp = new DOMXPath($dom);
$rows = $xp->query('//table[@class="table table-striped table-bordered"]/tr');
?>

<table>
  <tbody>
  <?php foreach ($rows as $row): ?>
    <tr>
    <?php foreach ($row->childNodes as $col): ?>
      <?php foreach ($col->childNodes as $colPart): ?>
        <?php if ($colText = trim($colPart->textContent)): ?>
        <td><?= $colText ?></td>
        <?php endif ?>
      <?php endforeach ?>
    <?php endforeach ?>
    </tr>
  <?php endforeach ?>
  </tbody>
</table>

2 answers:

Answer 0 (score: 0)

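Your XPath returns nothing because a "/" is missing. A single "/tr" matches only rows that are direct children of the table, so rows nested inside another element (such as a tbody) are not found; "//tr" matches rows at any depth below the table. Try: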

$rows = $xp->query("//table[@class='table table-striped table-bordered']//tr");
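
To see the difference between the two expressions without fetching the live page, here is a minimal sketch. The sample markup and the team names and odds in it are made up for illustration; the real page's structure is an assumption, namely that its rows sit inside a tbody.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = <<<HTML
<table class="table table-striped table-bordered">
  <tbody>
    <tr><td>Team A</td><td>1.50</td></tr>
    <tr><td>Team B</td><td>2.75</td></tr>
  </tbody>
</table>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($dom);
$table = '//table[@class="table table-striped table-bordered"]';

// "/tr" selects only direct children of the table; here the rows sit inside <tbody>.
$direct = $xp->query($table . '/tr');
// "//tr" selects rows at any depth below the table.
$nested = $xp->query($table . '//tr');

echo "direct children: {$direct->length}\n"; // 0
echo "descendants: {$nested->length}\n";     // 2
```

If the real table has no tbody in its source, both expressions would match the same rows, so it is worth checking the raw HTML rather than the browser's element inspector, which may show a tbody the server never sent.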

Answer 1 (score: 0)

If this isn't exactly what you want, I believe it should get you close enough...

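$rows = $xp->query("//table[@class='table table-striped table-bordered']//tr");
echo "<table><tbody>";
// query() returns false (not null) when the expression is invalid
if ($rows !== false) {
  foreach ($rows as $row) {
    // Open and close the row inside the loop so the tags stay balanced
    echo "<tr>";
    foreach ($row->childNodes as $colPart) {
      $colText = trim($colPart->textContent);
      if ($colText !== '') {
        // Escape the scraped text before echoing it into HTML
        echo "<td>" . htmlspecialchars($colText) . "</td>";
      }
    }
    echo "</tr>";
  }
}
echo "</tbody></table>";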
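
The same idea can be tried offline by separating extraction from output. This is only a sketch under stated assumptions: extractRows() is a hypothetical helper (not part of PHP or of the original code), and the inline markup is made-up sample data rather than the real page.

```php
<?php
/**
 * Hypothetical helper: returns each matched row as an array of trimmed cell texts.
 */
function extractRows(string $html, string $rowQuery): array
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xp = new DOMXPath($dom);
    $rows = $xp->query($rowQuery);
    if ($rows === false) {
        return []; // the XPath expression was invalid
    }

    $result = [];
    foreach ($rows as $row) {
        $cells = [];
        // Select only cell elements, skipping whitespace text nodes between them.
        foreach ($xp->query('./td | ./th', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        $result[] = $cells;
    }
    return $result;
}

// Made-up sample markup standing in for the real page.
$html = '<table class="table table-striped table-bordered"><tbody>'
      . '<tr><td> Team A </td><td>1.50</td></tr>'
      . '<tr><td>Team B</td><td>2.75</td></tr>'
      . '</tbody></table>';

$rows = extractRows($html, "//table[@class='table table-striped table-bordered']//tr");

echo "<table><tbody>\n";
foreach ($rows as $cells) {
    echo "<tr>";
    foreach ($cells as $text) {
        // Escape scraped text before writing it into the output HTML.
        echo "<td>" . htmlspecialchars($text) . "</td>";
    }
    echo "</tr>\n";
}
echo "</tbody></table>\n";
```

Returning plain arrays keeps the scraping step easy to inspect (for example with var_dump) before any HTML is printed, which helps when the output is unexpectedly empty.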