Converting an HTML-string scraper into a URL scraper

Time: 2015-11-19 16:53:09

Tags: php html dom scraper

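A very helpful person on Stack Overflow got me this far, but I need to convert his code so that it takes a URL instead of an HTML string. I have tried over and over and keep getting errors. How should I change it?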

function getElementByIdAsString($html, $id, $pretty = true) {
$doc = new DOMDocument();

// loadHTML() returns false on failure; @ silences warnings about malformed markup
if(!@$doc->loadHTML($html)) {
    throw new Exception("Failed to load the HTML");
}
$element = $doc->getElementById($id);
if(!$element) {
    throw new Exception("An element with id $id was not found");
}

// get all object tags
$objects = $element->getElementsByTagName('object'); // return node list

// take the value of the data attribute from the first object tag
$first = $objects->item(0);
if(!$first) {
    throw new Exception("No object tag was found inside the element with id $id");
}
$data = $first->getAttribute('data');

// cut away the unnecessary parts and return the info
return substr($data, strpos($data, '=')+1);

}

// call it:
$finalcontent = getElementByIdAsString($html, 'mainclass');

print_r ($finalcontent);

1 answer:

Answer 0 (score: 1)

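Remember to wrap the call in try/catch, because the function can throw an Exception. An uncaught exception is a fatal error, which typically surfaces as a 500 Server Error.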

$finalcontent = getElementByIdAsString($html, 'mainclass');

should become

try {
    $finalcontent = getElementByIdAsString($html, 'mainclass');
}catch(Exception $e){
    echo $e->getMessage();
}
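
For the URL part of the question, here is a minimal sketch of one way to adapt the function. It is not taken from the answer above: the name getObjectDataFromUrl is hypothetical, and it assumes allow_url_fopen is enabled so that file_get_contents() can read http(s) addresses. Unlike the original, it returns the whole data value when it contains no '='.

```php
<?php
// Hypothetical helper: fetch the page at $url, then return the part after '='
// in the data attribute of the first <object> inside the element with id $id.
function getObjectDataFromUrl($url, $id) {
    // Assumption: allow_url_fopen is on, so http(s) URLs can be read here.
    $html = @file_get_contents($url);
    if ($html === false || $html === '') {
        throw new Exception("Failed to load $url");
    }

    $doc = new DOMDocument();
    // loadHTML() returns false on failure; @ silences malformed-markup warnings.
    if (!@$doc->loadHTML($html)) {
        throw new Exception("Failed to parse the HTML from $url");
    }

    $element = $doc->getElementById($id);
    if (!$element) {
        throw new Exception("An element with id $id was not found");
    }

    // First <object> tag below the element, if there is one.
    $first = $element->getElementsByTagName('object')->item(0);
    if (!$first) {
        throw new Exception("No object tag was found inside the element with id $id");
    }

    // Keep only what follows the first '='; return the value unchanged if none.
    $data = $first->getAttribute('data');
    $pos = strpos($data, '=');
    return $pos === false ? $data : substr($data, $pos + 1);
}
```

Call it inside the same try/catch shown above, passing the page address in place of $html. If allow_url_fopen is disabled on the server, the fetch step would need another mechanism such as the cURL extension.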