Error when parsing a table in HTML with PHP DOM

Date: 2013-10-27 09:05:53

Tags: php html

When I try to use PHP to get data from a table in an HTML page, I get this error:

"message":"DOMDocument::loadHTML(): Misplaced DOCTYPE declaration in Entity"

The code in the PHP file is:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
curl_setopt($ch, CURLOPT_POST, 1);
$result = curl_exec($ch);

if (!$result) {
    $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch); // make sure we close any current curl sessions
    die($http_code.' Unable to connect to server. Please come back later.');
}
curl_close($ch);


/*** a new dom object ***/
$dom = new DOMDocument;

/*** load the html into the object ***/
$dom->loadHTML($result);

/*** discard white space ***/
$dom->preserveWhiteSpace = false;

/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');

/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');

/*** loop over the table rows ***/

I don't think the HTML page is well-formed, but I can't change it... so can I still use DOM to get the data?

1 answer:

Answer 0: (score: 1)

You have to use:

libxml_clear_errors(); libxml_use_internal_errors($errors);
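
The answer does not show where $errors comes from. Presumably it holds the previous setting returned by an earlier call to libxml_use_internal_errors(true), made before loadHTML(). A minimal sketch of that pattern follows, under stated assumptions: the $result string is a hypothetical stand-in for the cURL response from the question (with a deliberately misplaced DOCTYPE), and the $data array and the row/cell loop are illustrative additions, not part of the original question or answer.

```php
<?php
// Hypothetical stand-in for the cURL response: HTML with a misplaced DOCTYPE.
$result = '<html><!DOCTYPE html><body><table>'
        . '<tr><td>a</td><td>b</td></tr>'
        . '<tr><td>c</td><td>d</td></tr>'
        . '</table></body></html>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
// The call returns the previous setting so it can be restored afterwards.
$errors = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($result);

// Discard the collected parse errors and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($errors);

// Illustrative loop: collect the trimmed text of every cell, row by row.
$data = [];
$table = $dom->getElementsByTagName('table')->item(0);
if ($table !== null) {
    foreach ($table->getElementsByTagName('tr') as $row) {
        $cells = [];
        foreach ($row->getElementsByTagName('td') as $cell) {
            $cells[] = trim($cell->textContent);
        }
        $data[] = $cells;
    }
}
```

To inspect the parse problems rather than discard them, libxml_get_errors() can be called before libxml_clear_errors(). The null check on $table guards against pages that contain no table at all, which the question's $tables->item(0) call does not handle.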