I want to get all the links on a page so that I can read their attributes, such as href and title.
<?php
function exception_handler($exception) {
    echo "Uncaught exception: " , $exception->getMessage(), "\n";
}
set_exception_handler('exception_handler');
function dom_create()
{
    echo("domcreate");
    $file = file_get_html('http://www.facebook.com/plugins/fan.php?connections=100&id=40796308305');
    echo($file);
    $doc = new DOMDocument();
    $doc->loadHTMLFile($file);
    $xpath = new DOMXpath($doc);
    $elements = $xpath->query("//*[@id]");
    if (!is_null($elements)) {
        foreach($elements as $e){
            $documentLinks = $e->getElementsByTagName('a');
        }
    else
        echo "NULL";
    }
}
dom_create();
?>
I get no output at all, not even from the echo statements I added. Does anyone have an idea?
Answer 0 (score: 0)
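Your braces are mismatched. The else sits inside the if block, right after the closing brace of the foreach, so PHP stops with a parse error before any echo can run. It should look like this: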
if (!is_null($elements)) {
    foreach ($elements as $e) {
        $documentLinks = $e->getElementsByTagName('a');
        // perhaps add echo here if you want to output the links somehow
    }
} else {
    echo "NULL";
}
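As a rough sketch of what that echo could look like when you want the href and title of each link: the helper name extract_links and the sample markup below are made up for illustration and are not part of the original code. It only relies on DOMDocument::loadHTML, getElementsByTagName and DOMElement::getAttribute.

```php
<?php
// Hypothetical helper (not from the original post): collects the href, title
// and link text of every <a> element found in an HTML string.
function extract_links(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; buffer libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $links[] = [
            'href'  => $a->getAttribute('href'),  // empty string when the attribute is missing
            'title' => $a->getAttribute('title'),
            'text'  => trim($a->textContent),
        ];
    }
    return $links;
}

// Sample markup standing in for the fetched page (an assumption for this sketch).
$html = '<div id="box"><a href="/one" title="First">One</a> <a href="/two">Two</a></div>';
foreach (extract_links($html) as $link) {
    echo $link['href'], ' | ', $link['title'], ' | ', $link['text'], PHP_EOL;
}
```

Returning an array rather than echoing inside the loop keeps the parsing step separate from however you decide to display the links.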
Answer 1 (score: 0)
I solved it by using file_get_contents and passing it a stream context.
<?php
function exception_handler($exception) {
    echo "Uncaught exception: " , $exception->getMessage(), "\n";
}
set_exception_handler('exception_handler');

function dom_create()
{
    // Send a browser-like User-Agent header with the request.
    $context = stream_context_create(array('http' => array('header' => 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0')));
    $file = file_get_contents('http://www.facebook.com/plugins/fan.php?connections=100&id=6568341043637', false, $context);

    // Parse the fetched markup and print every <a> element.
    $dom = new DOMDocument;
    $dom->loadHTML($file);
    foreach ($dom->getElementsByTagName('a') as $node) {
        echo $dom->saveHTML($node), PHP_EOL;
    }
}
dom_create();
?>
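One thing the code above does not handle is a failed request: file_get_contents returns false in that case, and that value would be passed straight to loadHTML. A possible guard is sketched below. The helper name fetch_html, the default User-Agent string and the 10 second timeout are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): fetches a URL with a
// custom User-Agent and returns null instead of false when the request fails.
function fetch_html(string $url, string $userAgent = 'Mozilla/5.0'): ?string
{
    $context = stream_context_create([
        'http' => [
            'header'  => 'User-Agent: ' . $userAgent,
            'timeout' => 10, // seconds; an arbitrary value for this sketch
        ],
    ]);
    // @ suppresses the PHP warning; failure is reported through the return value.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}
```

In dom_create you could then call fetch_html with the URL, echo an error message and return early when the result is null, and only otherwise hand the string to loadHTML.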