Question

I have the following source code:

<?php

    function getTerms()
    {
        $doc = new DOMDocument();
        libxml_use_internal_errors(true);
        $doc->loadHTML('https://charitablebookings.com/terms'); // loads your HTML
        $xpath = new DOMXPath($doc);
        // returns a list of all links with rel=nofollow
        $nodeList = $xpath->query("//div[@class='terms-conditions']");
        $temp_dom = new DOMDocument();
        $node = $nodeList->item(0);         
        $temp_dom = new DOMDocument();
        foreach($nodeList as $n) $temp_dom->appendChild($temp_dom->importNode($n,true));
        print_r($temp_dom->saveHTML());         

    }


    getTerms();
?>

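I am trying to get text from a web page by selecting a specific class. When I call print_r on the output of $temp_dom, the browser shows nothing, and $node is null. What am I doing wrong?

Thanks for your time.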

Answer 1

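The first problem is that DOMDocument's loadHTML method expects the HTML content itself as its first argument, not a URL:

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $html = file_get_contents('https://charitablebookings.com/terms');
    $doc->loadHTML($html);

The second problem is your XPath expression, $xpath->query("//div[@class='terms-conditions']"): the fetched document contains no div with the class terms-conditions (that content is loaded by a JavaScript loader), so the query matches nothing.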
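Both points can be combined into one small function. What follows is only a minimal sketch, not a drop-in fix: extractDivsByClass is a hypothetical helper name, the inline $sample markup is an assumption standing in for the fetched page, and the class name is assumed to contain no quote characters. It parses markup rather than a URL, matches the class as a token so that class="box terms-conditions" is found as well, and returns an empty string when nothing matches instead of failing silently.

```php
<?php
// Hypothetical helper: returns the outer HTML of every <div> carrying $class,
// or an empty string when the markup contains no such element.
function extractDivsByClass(string $html, string $class): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc->loadHTML($html);              // expects markup, not a URL
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Token match on @class, so multi-class attributes are matched too.
    // Assumes $class contains no quote characters.
    $query = sprintf(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    $nodeList = $xpath->query($query);

    if ($nodeList === false || $nodeList->length === 0) {
        return '';  // nothing matched, e.g. the block is injected by JavaScript
    }

    $out = '';
    foreach ($nodeList as $node) {
        $out .= $doc->saveHTML($node);  // serialize only this node
    }
    return $out;
}

// Inline sample markup (an assumption, standing in for the downloaded page).
$sample = '<html><body><div class="box terms-conditions"><p>Terms</p></div></body></html>';
echo extractDivsByClass($sample, 'terms-conditions'), "\n";
```

For the real page, $sample would be replaced with the result of file_get_contents('https://charitablebookings.com/terms'). If the function then returns an empty string, that is consistent with the div being added by JavaScript after the page loads, in which case DOMDocument alone cannot see it.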
