Reading an HTML table into a PHP array

Time: 2011-09-15 14:16:11

Tags: php html

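I am trying to read a table from an HTML file into an array, and I am stuck. Any help would be appreciated.

Each table element should be stored in one array value.

Example: $arr[1] = DER HE1 ges 1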

PHP

<?php
      libxml_use_internal_errors(true);
      $i=0;
      // new dom object  
      $dom = new DOMDocument();  

      //load the html  
      $html = $dom->loadHTMLFile("106642new.html");  

      //discard white space   
      $dom->preserveWhiteSpace = false;   

      //the table by its tag name  
      $tables = $dom->getElementsByTagName('table');   

      //get all rows from the table  
      $rows = $tables->item(0)->getElementsByTagName('tr');   
      // $test = $tables->item(0)->getElementsByTagName('td');   

      // loop over the table rows  
      foreach ($rows as $row) {
          // get each column by tag name  
          $cols = $row->getElementsByTagName('td');  
          $i= $i + 1 ;
          $value = "Nummer: ".$i.":  ".$cols->item(0)->nodeValue.PHP_EOL;
          // $value = "test: ".$i.":  ".$cols->item(0)->nodeValue.PHP_EOL;
          $cols = array(1, 2, 3, 4, 5);
          echo $value;
          //  $cols[$i] = $row; 
          // echo the values    
          //echo $cols->item(0)->nodeValue ; 
      }   
?>

HTML:

<body bgcolor="#FFFFFF" topmargin="0" leftmargin="0" marginwidth="0" marginheight="0">

          <div align=left>

          <table BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH="100%" height="100%">

          <tr><td valign="top">&nbsp</td></tr>

          <tr><td valign="top">

          <p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>
          <br><div font class="lNameHeader"> </font> </div><table border=1>
          <tr class="AccentDark">
           <td align="left" width="65" class="tableHeader"></td>
           <td align="center" width="auto" class="tableHeader">Maandag</td>
           <td align="center" width="auto" class="tableHeader">Dinsdag</td>
           <td align="center" width="auto" class="tableHeader">Woensdag</td>
           <td align="center" width="auto" class="tableHeader">Donderdag</td>
           <td align="center" width="auto" class="tableHeader">Vrijdag</td>
          </tr><tr>
           <td align="left" width="50" class="tableHeader">1e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE09</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WIK</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC17</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">biol</td>
           <td align="left" width="9" class="tableCell">4</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">2e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BRT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">netl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BAU</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HG01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">lo</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">3e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE07</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BAU</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HG01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">lo</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">4e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DER</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE08</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ges</td>
           <td align="left" width="9" class="tableCell">1</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC06</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DER</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE10</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ges</td>
           <td align="left" width="9" class="tableCell">1</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">CHR</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HB15</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ckv</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">5e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DOC</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE09</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">m&o</td>
           <td align="left" width="9" class="tableCell">2</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BRT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">netl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC03</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">6e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC03</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>

2 answers:

Answer 0 (score: 1)

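I think the problem is that your first table is a container for the other tables. If you want the contents of all the tables, you should also iterate over the list of tables.

If you only want the contents of the inner tables, try locating them in the DOM first. I suggest finding the first table, then looking up all the table elements inside it and iterating over them.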
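A minimal sketch of that suggestion, not a definitive implementation. The inline markup is a cut-down stand-in for the timetable in the question, and the rule "an innermost table becomes one array value" is an assumption about the desired result. Note that the array is zero-based, so the first entry is $arr[0].

```php
<?php
// Cut-down stand-in for the question's markup (assumption: same nesting).
$html = <<<'HTML'
<table>
  <tr><td>
    <table border="1">
      <tr>
        <td class="tableHeader">4e uur</td>
        <td class="tableCell"><table><tr>
          <td class="tableCell">DER</td><td class="tableCell">&nbsp;</td>
          <td class="tableCell">HE08</td><td class="tableCell">&nbsp;</td>
          <td class="tableCell">ges</td><td class="tableCell">1</td>
        </tr></table></td>
        <td class="tableCell"><table><tr>
          <td class="tableCell">KEJ</td><td class="tableCell">&nbsp;</td>
          <td class="tableCell">HC06</td><td class="tableCell">&nbsp;</td>
          <td class="tableCell">wisA</td><td class="tableCell">3</td>
        </tr></table></td>
      </tr>
    </table>
  </td></tr>
</table>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$arr = array();

// The first table in document order is the outer container.
$outer = $dom->getElementsByTagName('table')->item(0);

foreach ($outer->getElementsByTagName('table') as $table) {
    // Skip tables that themselves wrap further tables; keep only the innermost ones.
    if ($table->getElementsByTagName('table')->length > 0) {
        continue;
    }

    $parts = array();
    foreach ($table->getElementsByTagName('td') as $td) {
        // &nbsp; arrives as a UTF-8 non-breaking space; treat it as ordinary whitespace.
        $text = trim(str_replace("\xC2\xA0", ' ', $td->nodeValue));
        if ($text !== '') {
            $parts[] = $text;
        }
    }

    // One array value per inner table (an empty string for an empty slot).
    $arr[] = implode(' ', $parts);
}

print_r($arr);
```

With the sample above, $arr[0] holds "DER HE08 ges 1" and $arr[1] holds "KEJ HC06 wisA 3". For the real file, loadHTMLFile("106642new.html") would take the place of loadHTML($html).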

var_dump is a good starting point for debugging. You don't need anything beyond what you have already done; just debug and test some more :)
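As a rough illustration of that kind of debugging, the sketch below dumps a few counts for a hypothetical two-level table (not the real file). It shows that getElementsByTagName() returns all descendants, so the rows of the outer table include the rows of nested tables, and that the first cell of the outer row carries the text of everything inside it.

```php
<?php
// Hypothetical, cut-down markup: one outer table wrapping one inner table.
$html = '<table><tr><td>'
      . '<table><tr><td>DER</td><td>HE08</td></tr></table>'
      . '</td></tr></table>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);   // returns a bool, not the parsed document

$tables = $dom->getElementsByTagName('table');
$rows   = $tables->item(0)->getElementsByTagName('tr');

var_dump($loaded);          // whether parsing succeeded
var_dump($tables->length);  // number of <table> elements in the whole document
var_dump($rows->length);    // rows under the first table, nested rows included

// Text of the first cell of the outer row: it contains the whole inner table.
$firstCell = $rows->item(0)->getElementsByTagName('td')->item(0)->nodeValue;
var_dump($firstCell);
```

Here the outer table reports two rows even though it has only one of its own, which is why looping over $rows in the question prints far more than one cell per line.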

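Answer 1 (score: 0)

My guess is that the HTML/XML being invalid is what is tripping you up.

You are using the loadHTMLFile() function, which may tolerate malformed HTML to some extent, but it may also require valid HTML/XML.

If it does require valid XML, then what may be happening is that "<br>" is not interpreted as a standalone node but as the start of a node, meaning everything after it becomes a child of "<br>".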
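That guess can be checked directly. The sketch below is only a quick test on a made-up fragment: it parses an unclosed <br> with loadHTML() and looks at whether the <br> element ended up with children. As far as I know, loadHTML() and loadHTMLFile() go through libxml's HTML parser, which treats <br> as an empty element rather than applying XML rules.

```php
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();

// Unclosed <br>, in the same style as the question's markup.
$dom->loadHTML('<div>before<br>after</div>');

$br  = $dom->getElementsByTagName('br')->item(0);
$div = $dom->getElementsByTagName('div')->item(0);

// If <br> swallowed the following content, it would have child nodes.
var_dump($br->hasChildNodes());

// Children of <div>: the text "before", the <br> element, the text "after".
var_dump($div->childNodes->length);
var_dump($div->lastChild->nodeValue);
```

If the <br> turns out to have no children, the unclosed tag is not what breaks the loop in the question, and the table nesting is the more likely cause.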

Also, this line makes no sense:

<p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>

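The <font> tag has been obsolete for years and should never be used. More importantly, this is not a font tag but a p tag, yet it is still closed as if it were a font tag. Just do: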

<p class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</p>

So the answer may simply be that your HTML/XML is invalid.
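One way to see what the parser actually objects to is to read back the errors that libxml_use_internal_errors(true) has been collecting. This is a small sketch on a shortened version of the line quoted above; the exact messages and their number depend on the libxml version, so none are shown here.

```php
<?php
libxml_use_internal_errors(true);   // collect parser errors instead of emitting warnings

$dom = new DOMDocument();
$dom->loadHTML('<p font class="Header">Basisrooster</font></p>');

// Each entry is a LibXMLError object with, among others, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), PHP_EOL;
}
libxml_clear_errors();              // empty the buffer before parsing anything else

// Despite the stray closing tag, the paragraph text is still reachable.
$text = $dom->getElementsByTagName('p')->item(0)->nodeValue;
echo $text, PHP_EOL;
```

Running the same loop after loadHTMLFile("106642new.html") would list the problems in the real file, which helps separate "invalid markup" from "wrong DOM traversal".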

(Dan Bizdadea also makes a good point.)