$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $img = $div->getElementsByTagName("img");
        var_dump($img->getAttribute("src"));
    }
}
This returns nothing.
I have the following element in the DOM:
<div class="doc-banner-icon"><img src="somesrc"></div>
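I am trying to get the img src. Because the page contains many images, I want to get the parent div first and then extract the image inside it.
Here is the solution: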
$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $listOfImages = $div->getElementsByTagName("img");
        foreach ($listOfImages as $img) {
            var_dump($img->getAttribute("src"));
        }
    }
}
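The inner loop matters because getElementsByTagName() returns a DOMNodeList even when only one element matches, and the list itself has no getAttribute() method. A minimal sketch of that difference, assuming an inline HTML string (the hypothetical $html below) in place of the fetched page so that it runs offline:

```php
<?php
// Hypothetical stand-in for the fetched page content.
$html = '<html><body><div class="doc-banner-icon"><img src="somesrc"></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$div = $doc->getElementsByTagName('div')->item(0);
$images = $div->getElementsByTagName('img');

// The result is a list, not a single element.
var_dump($images instanceof DOMNodeList);         // bool(true)
var_dump(method_exists($images, 'getAttribute')); // bool(false)

// Each item in the list is a DOMElement, which does provide getAttribute().
$sources = [];
foreach ($images as $img) {
    $sources[] = $img->getAttribute('src');
}
print_r($sources);
```

Iterating the list, or picking one entry with item(), is what gives access to the element's attributes.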
Answer 0 (score: 0)
You are not missing anything; var_dump does not work as expected on a DOMNodeList. Try this:
$listOfImages = $doc->getElementsByTagName("img");
foreach ($listOfImages as $img) {
    $imgClass = $img->getAttribute('class');
    echo $imgClass;
}
In your updated question, just change:
$img->getAttribute("src")
to:
$img->item(0)->getAttribute("src")
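One caveat with item(0): it returns null when the list is empty, so calling getAttribute() on the result fails for a div that contains no image. A minimal sketch of a guarded version, assuming PHP 7.1 or later for the nullable return type; the helper name firstImageSrc and the inline HTML are hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: returns the src of the first <img> inside $container,
// or null when the container holds no image.
function firstImageSrc(DOMElement $container): ?string
{
    $first = $container->getElementsByTagName('img')->item(0);
    // item() returns null when the index is out of range.
    return $first instanceof DOMElement ? $first->getAttribute('src') : null;
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body>'
    . '<div class="doc-banner-icon"><img src="somesrc"></div>'
    . '<div class="empty"></div>'
    . '</body></html>'
);
$divs = $doc->getElementsByTagName('div');

var_dump(firstImageSrc($divs->item(0))); // the div with an image
var_dump(firstImageSrc($divs->item(1))); // the div without one
```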
Given that your selection criteria are fairly complex, you could consider using XPath instead of navigating the tree manually:
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$xpath = new DOMXPath($doc);
$img = $xpath->query("//div[@class = 'doc-banner-icon']/img");
var_dump($img->item(0)->getAttribute('src'));
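Note that the comparison @class = 'doc-banner-icon' matches only when the class attribute is exactly that string. If the real markup carries additional classes (an assumption, since the question shows just one), a token-style match may be more robust. A minimal sketch using the common concat/normalize-space idiom from XPath 1.0, run against a hypothetical inline document:

```php
<?php
// Hypothetical markup: the target div carries a second class.
$html = '<html><body>'
      . '<div class="doc-banner-icon featured"><img src="somesrc"></div>'
      . '<div class="other"><img src="unrelated"></div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match "doc-banner-icon" as a whole class token, then select the src attribute.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' doc-banner-icon ')]/img/@src";

$sources = [];
foreach ($xpath->query($query) as $attr) {
    $sources[] = $attr->nodeValue;
}
print_r($sources);
```

Selecting /img/@src directly yields attribute nodes, so iterating the result also avoids the null check that item(0) would need when nothing matches.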