I'm trying to parse a web page, but I've run into trouble.
I end up with this error:
Parse error: syntax error, unexpected 'Steve' (T_STRING)
<?php
// Here I would like to parse a wikipedia page
$url = "http://fr.wikipedia.org/wiki/Jobs_(film)";
$html = file_get_contents($url);
$doc = new DOMDocument();
$doc->loadHTML($html);
// I use XPath to parse the page
$xpath = new DOMXpath($doc);
// Here I save all links of the wikipedia page
$nodes = $xpath->query('//a');
?>
<?php
// Showing elements :
if($nodes)
{
echo '<h1>les <span class="red">'.$nodes->length. '</span> liens de la page : '.$url.'</h1>';
// Table to show some elements
echo '<table>
<thead><tr><th>ancre</th><th>title</th><th>url</th><th>rel</th></tr></thead><tbody>';
// Here I search all the elements with title, links...
foreach($nodes as $node) {
if($node->getAttribute('rel')){$rel = $node->getAttribute('rel');}else{$rel= "-";}
if($node->getAttribute('title')){$title = $node->getAttribute('title');}else{$title= "-";}
if($node->nodeValue){$ancre = $node->nodeValue;}else{$rel= "-";}
if($node->getAttribute('href[contains(text(),'Steve')]')){$href = $node->getAttribute('href[contains(text(),'Steve')]');}else{$rel= "-";}
// The table contains all the elements, but I would like to filter them...
echo '<tr><td>'. $ancre .'</td><td>'. $title .'</td><td>'. $href .'</td><td>'.$rel.'</td></tr>';
}
echo '</tbody></table>';
}
Answer 0 (score: 1)
It's a quoting problem:
if($node->getAttribute("href[contains(text(),'Steve')]")){$href = $node->getAttribute("href[contains(text(),'Steve')]");}else{$rel= "-";}
Answer 1 (score: 1)
Your quotes are wrong. Use double quotes around the outer string:
if($node->getAttribute("href[contains(text(),'Steve')]")){$href = $node->getAttribute("href[contains(text(),'Steve')]");}else{$rel= "-";}
                       ^                              ^                               ^                              ^
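Fixing the quotes removes the parse error, but note that getAttribute() takes a plain attribute name, not an XPath expression, so the call above will simply return an empty string. If the goal is the filter mentioned in the question's comment, one option is to put the condition in the XPath query itself. The following is a minimal sketch under some assumptions: the inline $html string is hypothetical sample markup standing in for the downloaded page, the $rows array is only there for illustration, and the DOM extension is assumed to be enabled.

```php
<?php
// Hypothetical sample markup standing in for the downloaded Wikipedia page.
$html = '<html><body>'
      . '<a href="/wiki/Steve_Jobs" title="Steve Jobs">Steve Jobs</a>'
      . '<a href="/wiki/Apple" title="Apple">Apple</a>'
      . '<a href="/wiki/Steve_Wozniak">Steve Wozniak</a>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Filter inside the XPath expression: keep only <a> elements
// whose text contains "Steve". Double quotes wrap the PHP string,
// single quotes wrap the XPath string literal.
$nodes = $xpath->query("//a[contains(text(),'Steve')]");

// Collect one row per matching link, using "-" for missing values.
$rows = [];
foreach ($nodes as $node) {
    $rows[] = [
        'ancre' => $node->nodeValue !== '' ? $node->nodeValue : '-',
        'title' => $node->getAttribute('title') ?: '-',
        'href'  => $node->getAttribute('href') ?: '-',
        'rel'   => $node->getAttribute('rel') ?: '-',
    ];
}

print_r($rows);
```

Each fallback here is assigned to its own variable, whereas the question's loop assigns "-" to $rel in the else branches meant for $ancre and $href, which is probably a copy-paste slip.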
Answer 2 (score: 1)
This line has single quotes nested inside single quotes:
if($node->getAttribute('href[contains(text(),'Steve')]'))
Either escape them (\') or use double quotes (") for the outer string instead.
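To illustrate the two options, here is a small sketch. The variable names $escaped and $doubleQuoted are made up for the example; both hold the same string, so either form avoids the parse error.

```php
<?php
// Option 1: keep the single-quoted string and escape the inner quotes.
$escaped = 'href[contains(text(),\'Steve\')]';

// Option 2: wrap the string in double quotes so the inner single
// quotes need no escaping. (Double-quoted strings interpolate
// variables, which is harmless here because there is no "$".)
$doubleQuoted = "href[contains(text(),'Steve')]";

// Both literals produce exactly the same string.
var_dump($escaped === $doubleQuoted);
```

The double-quoted form is usually easier to read when the string contains several single quotes, as an XPath expression often does.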