I'm trying to create a try/catch loop that downloads HTML from another website:
    foreach($intldes as $id) {
        $html = HtmlDomParser::file_get_html('https://nssdc.gsfc.nasa.gov/nmc/spacecraftDisplay.do?id='.$id);
        foreach($html->find('#rightcontent') as $id);
        foreach($html->find('.urone p') as $element);
        foreach($html->find('.urtwo') as $launchdata);
    }
If the data exists, the following HTML is generated:
<p><strong>NSSDCA/COSPAR ID:</strong> 2009-038F</p>
<p>ANDE 2, the Atmospheric Neutral Density Experiment 2, is a pair of microsatellites (Castor and Pollux) launched from Cape Canaveral on STS 127 on 15 July 2009 at 22:03 UT and deployed from the payload bay of the shuttle on 30 July 2009 at 17:22 UT.</p>
<p><strong>Launch Date:</strong> 2009-07-15<br/><strong>Launch Vehicle:</strong> Shuttle<br/><strong>Launch Site:</strong> Cape Canaveral, United States<br/></p>
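If the data doesn't exist, I get an "Undefined variable: element" error, which means the DOM parser couldn't find the HTML I want to display.
So I need something that skips pages that lack the required HTML or that return a NULL variable.
Basically, if the HTML I want, or the variable $element, doesn't exist, I want Guzzle to skip that page without loading it.
Edit
My full function: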
    public function tester() {
        $intldes = DB::table('examples')->pluck('id');
        foreach ($intldes as $query) {
            $html = HtmlDomParser::file_get_html('https://example.com?id='.$query);
            $elements = $html->find('.urone p', 0);
            if (is_array($elements)) {
                foreach($html->find('#rightcontent') as $rawid);
                foreach($html->find('.urone p') as $rawdescription);
                foreach($html->find('.urtwo') as $launchdata);
                //-- Data Parser --//
                //Intldes
                $intldesgetter = strip_tags($rawid->first_child()->next_sibling()->next_sibling()); //Get Element and Remove Tags
                $intldesformat = substr($intldesgetter, ($pos = strpos($intldesgetter, ':')) !== false ? $pos + 3 : 0); //Remove Title
                $dbintldes = ltrim($intldesformat); //Remove Blank-space
                //Description
                $description = strip_tags($rawdescription);
                $dbdescription = ltrim($description);
                //Launch Data
                $launchdate = $launchdata->first_child()->next_sibling()->next_sibling()->next_sibling();
                $explode = explode("<br/>", $launchdate);
                $newArray = array_map(function($v){
                    return trim(strip_tags($v));
                }, $explode);
                $dblaunchdate = substr($newArray[0], ($pos = strpos($newArray[0], ':')) !== false ? $pos + 3 : 0);
                $dblaunchvehicle = substr($newArray[1], ($pos = strpos($newArray[1], ':')) !== false ? $pos + 3 : 0);
                $dblaunchsite = substr($newArray[2], ($pos = strpos($newArray[2], ':')) !== false ? $pos + 3 : 0);
                //Data Saver
                DB::table('descriptions')->insert(
                    ['intldes' => $dbintldes, 'description' => strip_tags($dbdescription), 'launch_date' => $dblaunchdate, 'launch_vehicle' => $dblaunchvehicle, 'launch_site' => $dblaunchsite]
                );
                echo "Success";
            } else {
                echo "$query does not exist";
                continue;
            };
        }
    }
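Answer 0 (score: 0)
I think the error in your code comes from this line:
    foreach($html->find('.urone p') as $element);
In my experience, it's best to check that the HTML elements are actually available before iterating over them with a foreach loop.
You can use is_object() or is_array() to solve this. When you search for a single element, an object is returned. When you search for a set of elements, an array of objects is returned.
When searching for a set of elements, you can use: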
    $elements = $html->find('.urone p');
    if (is_array($elements)) {
        //continue
    }
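One caveat, as far as I know: in simple_html_dom, find() without an index returns an array even when nothing matches (an empty one), while find() with an index returns either an object or null. So a stricter guard is to test for an empty array or for null, depending on which form you call. The sketch below shows the same skip-if-missing idea in a form that runs without extra libraries. It uses PHP's built-in DOMDocument and DOMXPath instead of HtmlDomParser, the helper name parseSpacecraftPage() and the sample markup are made up for illustration, and it takes the first matching paragraph rather than the last one.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension rather than the
// HtmlDomParser wrapper from the question. parseSpacecraftPage() is a
// hypothetical helper name.

/**
 * Returns the description and launch-data text, or null when the page
 * lacks the expected markup.
 */
function parseSpacecraftPage(string $markup): ?array
{
    if (trim($markup) === '') {
        return null; // nothing was downloaded
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid; collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath equivalents of the '.urone p' and '.urtwo' selectors.
    $description = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' urone ')]//p"
    )->item(0);
    $launch = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' urtwo ')]"
    )->item(0);

    // item(0) yields null when nothing matched, so test before using it.
    if ($description === null || $launch === null) {
        return null;
    }

    return [
        'description' => trim($description->textContent),
        'launch'      => trim($launch->textContent),
    ];
}

// Made-up sample pages: one with the expected markup, one without.
$pages = [
    '2009-038F' => '<div class="urone"><p>ANDE 2 description</p></div>'
                 . '<div class="urtwo"><p>Launch Date: 2009-07-15</p></div>',
    'missing'   => '<div id="rightcontent"><p>No such spacecraft</p></div>',
];

foreach ($pages as $id => $markup) {
    $record = parseSpacecraftPage($markup);
    if ($record === null) {
        echo "$id does not exist\n";
        continue; // skip this page instead of touching undefined variables
    }
    echo "$id: {$record['description']}\n";
}
```

With HtmlDomParser the equivalent guard would presumably be a null check on $html->find('.urone p', 0), plus a check that file_get_html() did not return false, before any of the parsing code runs.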