Question

I am trying to write a simple PHP script to scrape an HTML page, but I can't work out why I get no results. Here is part of my PHP code:

// $html was successfully fetched from "http://m.hkgolden.com/topics.aspx?type=HW" with curl

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

$itemList = $xpath->query('//x:div[contains(@class,"TopicBox_Details")]/a');

var_dump($itemList); // shows --> object(DOMNodeList)#4 (0) { }

foreach ($itemList as $item) {
    $this->child_urls[] = $item->getElementsByTagName('a')->item(0)->getAttribute('href');
}

var_dump($this->child_urls); // shows --> array(0) { }

The same XPath query works in the Firefox XPath Checker add-on, but it does not work in PHP. What am I doing wrong?

Answer 1

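You should add the namespace prefix to the a element in the XPath as well. Namespaces are inherited, so an a element inside a namespaced div is in the same namespace, and an unprefixed a step will not match it: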

//x:div[contains(@class,"TopicBox_Details")]/x:a
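
To illustrate the difference, here is a minimal sketch. It assumes the markup is parsed as XML with loadXML, so every element keeps the XHTML default namespace. The $xhtml fragment, its href values, and the $childUrls variable are hypothetical stand-ins, not the real page or the asker's class.

```php
<?php
// Hypothetical XHTML fragment standing in for the fetched page (not the real markup).
$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="TopicBox_Details"><a href="view.aspx?message=1">Topic 1</a></div>
    <div class="TopicBox_Details"><a href="view.aspx?message=2">Topic 2</a></div>
  </body>
</html>
XML;

$dom = new DOMDocument();
$dom->loadXML($xhtml); // the XML parser keeps the default namespace on every element

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

// Unprefixed last step: "a" means "a in no namespace", so nothing matches.
$unprefixed = $xpath->query('//x:div[contains(@class,"TopicBox_Details")]/a');

// Prefix on every step: the links are found.
$prefixed = $xpath->query('//x:div[contains(@class,"TopicBox_Details")]/x:a');

$childUrls = [];
foreach ($prefixed as $link) {
    // Each $link is already the <a> element, so read href directly;
    // calling getElementsByTagName('a')->item(0) on it would return null.
    $childUrls[] = $link->getAttribute('href');
}

var_dump($unprefixed->length); // int(0)
var_dump($prefixed->length);   // int(2)
print_r($childUrls);
```

When the page is loaded with loadHTML, as in the question, the HTML parser typically leaves elements without a namespace. If the prefixed query still returns nothing, it may be worth trying the fully unprefixed form, //div[contains(@class,"TopicBox_Details")]/a.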
