I am trying to fetch a div from other pages and loop over it with foreach, but I am running into some trouble with this markup:
<div class="article_info">
<ul class="c-result_box">
<li>
<div class="inner cf">
<div class="c-header">
<div class="c-logo">
<img src="/e/designs/31sumai/common/img/logo_08.png" alt="#">
</div>
<p class="c-supplier">三井のマンション</p>
<p class="c-name">
<a href="https://www.31sumai.com/mfr/K1503/" class="link" target="_blank">パークリュクス大阪天満</a>
</p>
I am trying to get the text inside the <a> element. This is my code; what am I missing here?
$start_id = 1501;
while(true){
    $url = 'https://www.31sumai.com/mfr/K'.$start_id.'/outline.html';
    $html = file_get_contents($url);
    libxml_use_internal_errors(true);
    $DOMParser = new \DOMDocument();
    $DOMParser->loadHTML($html);
    $xpath = new \DOMXPath($DOMParser);
    $classname="c-name";
    $nodes = $finder->query("//*[contains(@class, '$classname')]");
    $MyTable = false;
    $insertData = [];
    foreach($nodes as $node){
        $allNames = [];
        foreach($node->getElementsByTagName('a') as $a){
            $name = $a->getElementsByTagName('a');
            $allProperties[] = [
                'names' => $name];
        }
    }
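Thanks for your help!
Answer 0 (score: 0)
You can rely on an XPath query to extract all the text nodes you need, then read the nodeValue property of each one in a loop: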
$start_id = "1501";
$url = "https://www.31sumai.com/mfr/K$start_id/outline.html";
$html = file_get_contents($url);
libxml_use_internal_errors(true);
$DOMParser = new \DOMDocument();
$DOMParser->loadHTML($html);
$xpath = new \DOMXPath($DOMParser);
$classname="c-name";
$nodes = $xpath->query("//*[contains(@class, '$classname')]/a/text()");
foreach($nodes as $node){
    echo $node->nodeValue;
}
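For reference, two things in the question's code stand in the way: `$finder` is never defined (the DOMXPath object is stored in `$xpath`), and `$a->getElementsByTagName('a')` searches for `<a>` elements nested inside the `<a>`, so it yields an empty node list rather than the link text. The sketch below shows one way to try the approach offline. It parses an inline copy of the markup from the question instead of fetching the page, and `extractNames` is a hypothetical helper name, not something from the original posts. It also swaps the plain `contains(@class, ...)` test for a stricter whole-token match, so that a class such as `c-name-extra` would not match. The `<?xml encoding="UTF-8">` prefix is a commonly used workaround for UTF-8 input and is an assumption about the page's encoding.

```php
<?php
// Inline stand-in for the fetched page, copied from the markup in the question.
$html = <<<'HTML'
<div class="article_info">
<ul class="c-result_box">
<li>
<div class="inner cf">
<div class="c-header">
<p class="c-supplier">三井のマンション</p>
<p class="c-name">
<a href="https://www.31sumai.com/mfr/K1503/" class="link" target="_blank">パークリュクス大阪天満</a>
</p>
</div>
</div>
</li>
</ul>
</div>
HTML;

// Hypothetical helper: returns the trimmed text of every <a> that is a direct
// child of an element whose class list contains the token "c-name".
function extractNames(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Assumption: the input is UTF-8. The prefix hints the encoding to libxml.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Pad the class attribute with spaces so only the whole token matches.
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' c-name ')]/a";

    $names = [];
    foreach ($xpath->query($query) as $a) {
        // textContent gives the text inside the element, without markup.
        $names[] = trim($a->textContent);
    }
    return $names;
}

print_r(extractNames($html));
```

To use it against the live site, the string passed to `extractNames` would come from `file_get_contents($url)`, which returns `false` on failure, so that result is worth checking before parsing.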