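I don't know what I'm doing wrong. On every pass it just keeps looping, pulls in every city listed in the same row, and puts them all under the current state. When it moves on to the next state it starts from the right place, but it still keeps going past where it should stop. I've been at this for 4 hours and I can't figure it out.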
$url = 'http://www.craigslist.org/about/sites';
$output = file_get_contents($url);
$doc = new DOMDocument();
libxml_use_internal_errors(true); // Suppress warnings caused by HTML5 markup
$doc->loadHTML($output);
libxml_use_internal_errors(false); //Start Showing Errors
$xpath = new DOMXpath($doc);
foreach ($xpath->query('//h1') as $e) {
    $country = $e->nodeValue;
    $list = array();
    foreach ($xpath->query('./following-sibling::div[@class="colmask"]', $e) as $li) {
        foreach ($xpath->query('//div/h4', $e) as $div) {
            $state = $div->nodeValue;
            foreach ($xpath->query('./following-sibling::ul/li', $div) as $div2) {
                $href = $div2->getAttribute("href");
                $text = trim(preg_replace("/[\r\n]+/", " ", $div2->nodeValue));
                echo 'Country: ' . $country . ' State: ' . $state . ' CITY: text['. $text . '] href[' . $href . '] <br/><br/><br/>';
            }
        }
    }
}
Answer 0 (score: 1)
You should avoid nesting query calls when doing this. Instead, use the item method on the DOMNodeList that the query method returns on each iteration.
For example, instead of writing:
foreach ($xpath->query('./following-sibling::div[@class="colmask"]', $e) as $li) {
    foreach ($xpath->query('//div/h4', $e) as $div) {
        $state = $div->nodeValue;
write:
$result = $xpath->query('./following-sibling::div[@class="colmask"]', $e);
$state = $result->item(0)->nodeValue;
If you need to navigate from a DOMNode (for example $state, if you store $result->item(0) itself rather than its nodeValue), use $state->parentNode, $state->nextSibling, and/or $state->previousSibling.
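As a rough illustration of the item and nextSibling approach suggested above, here is a minimal sketch. The markup is a made-up stand-in for the real page, and findNextSiblingByTag is a hypothetical helper written for this example, not part of the DOM extension.

```php
<?php
// Hypothetical markup: each <h4> state heading is followed by its own <ul>.
$html = '<div><h4>Alabama</h4><ul><li>auburn</li></ul>'
      . '<h4>Alaska</h4><ul><li>anchorage</li></ul></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: walk forward through nextSibling until an element
// with the requested tag name is found, or return null if there is none.
function findNextSiblingByTag(DOMNode $node, string $tagName): ?DOMElement
{
    for ($n = $node->nextSibling; $n !== null; $n = $n->nextSibling) {
        if ($n instanceof DOMElement && $n->tagName === $tagName) {
            return $n;
        }
    }
    return null;
}

// One query, then item() to pick a node instead of a nested foreach/query.
$headings = $xpath->query('//h4');
$firstHeading = $headings->item(0);

// Navigate from the heading to the list that directly follows it.
$list = findNextSiblingByTag($firstHeading, 'ul');

$cities = [];
if ($list !== null) {
    foreach ($list->getElementsByTagName('li') as $li) {
        $cities[] = trim($li->nodeValue);
    }
}

echo $firstHeading->nodeValue . ': ' . implode(', ', $cities) . "\n";
```

Because the walk stops at the first matching sibling, only the cities in the list belonging to that heading are collected.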
Answer 1 (score: 0)
Someone named DuffyDake answered my question. Here is the answer:
foreach ($xpath->query('./following-sibling::ul[1]/li', $div) as $div2) {
    $href = $div2->getAttribute("href");
    $text = trim(preg_replace("/[\r\n]+/", " ", $div2->nodeValue));
    echo 'Country: ' . $country . ' State: ' . $state . ' CITY: text['. $text . '] href[' . $href . '] <br/><br/><br/>';
}
The missing piece was the [1], which selects only the first UL that follows the heading instead of every UL after it.
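To see the effect of the [1] predicate in isolation, the following sketch runs both expressions against a small made-up fragment. The markup and the citiesByState helper are assumptions for illustration only; the real page may be structured differently.

```php
<?php
// Hypothetical markup imitating the layout described in the question:
// state headings and city lists are siblings inside one container.
$html = <<<HTML
<div class="colmask">
  <h4>Alabama</h4>
  <ul><li>auburn</li><li>birmingham</li></ul>
  <h4>Alaska</h4>
  <ul><li>anchorage</li></ul>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: for every <h4>, collect the text of the <li> nodes
// matched by $listExpr, evaluated relative to that heading.
function citiesByState(DOMXPath $xpath, string $listExpr): array
{
    $result = [];
    foreach ($xpath->query('//h4') as $h4) {
        $cities = [];
        foreach ($xpath->query($listExpr, $h4) as $li) {
            $cities[] = trim($li->nodeValue);
        }
        $result[trim($h4->nodeValue)] = $cities;
    }
    return $result;
}

// Without [1]: every following <ul> sibling is matched, so later states leak in.
$all = citiesByState($xpath, './following-sibling::ul/li');

// With [1]: only the nearest following <ul> sibling is matched.
$first = citiesByState($xpath, './following-sibling::ul[1]/li');

print_r($all);
print_r($first);
```

With this fragment, the expression without [1] lists anchorage under Alabama as well, while the version with [1] keeps each state's cities separate.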