Question

I can parse tags from an HTML file using getElementsByTagName. But I also want to parse the ids and class names that are present in that HTML file.

This is what I tried:

    $html = new DOMDocument();
    $html->loadHTMLFile($url); //url is the url of the site
    $data = $html->getElementById($identifier); //identifier is the id
    $value = array();

    foreach($data as $element)
    {
        $value[] = $element->nodeValue."<br />";
    }
    print_r($value);

But when I use getElementById, the only output I get is array(), so I can't parse the data. Please also tell me how to get the id and class name values.

Answer 1

I know a great tool for this: phpQuery.

    phpQuery::newDocumentFileXHTML('my-xhtml.html')->find('#hello');

You can find examples here.

Alternatively, you can use XPath, which also works well.
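
To illustrate the XPath route, here is a minimal sketch using PHP's built-in DOMXPath class. The sample markup, the id `hello`, the class `box`, and the variable names are all made up for this example; in practice you would load your own file with loadHTMLFile instead.

```php
<?php
// Hypothetical sample markup, used only for this illustration.
$markup = '<html><body>'
    . '<div id="hello" class="greeting box">Hi</div>'
    . '<p class="box">Text</p>'
    . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Look up a single element by its id attribute.
$byId = $xpath->query('//*[@id="hello"]');
$helloText = $byId->length > 0 ? $byId->item(0)->nodeValue : null;

// Match elements whose class list contains the token "box".
// Padding with spaces avoids matching names like "boxed".
$classQuery = '//*[contains(concat(" ", normalize-space(@class), " "), " box ")]';
$boxTags = [];
foreach ($xpath->query($classQuery) as $node) {
    $boxTags[] = $node->nodeName;
}

echo $helloText, "\n";
print_r($boxTags);
```

Unlike getElementById, query() returns a node list, so looping over the result with foreach is appropriate here.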

Answer 2

There's no need for a foreach loop, because there can be only one element with a given ID:

    $doc = new DOMDocument();
    $doc->loadHTMLFile('http://stackoverflow.com/questions/15154290/parsing-the-ids-and-classnames-from-a-html-file');

    // getElementById returns a single DOMElement, or null if nothing matches
    $element = $doc->getElementById('question');
    if (!is_null($element)) {
        echo $element->getAttribute('class');
    }
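
If the goal is to list every id and class name in the document rather than look up one known id, something along these lines might work. This is only a sketch: the sample markup and the `$ids` / `$classNames` variable names are assumptions for illustration, and it relies on getElementsByTagName('*') matching every element.

```php
<?php
// Hypothetical sample markup, used only for this illustration.
$markup = '<html><body>'
    . '<div id="question" class="question summary">Q</div>'
    . '<div id="answers" class="summary">A</div>'
    . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

$ids = [];
$classNames = [];

// "*" matches every element in the document.
foreach ($doc->getElementsByTagName('*') as $element) {
    if ($element->hasAttribute('id')) {
        $ids[] = $element->getAttribute('id');
    }
    if ($element->hasAttribute('class')) {
        // A class attribute can hold several whitespace-separated names.
        $names = preg_split('/\s+/', trim($element->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY);
        $classNames = array_merge($classNames, $names);
    }
}

// Drop duplicate class names and reindex the array.
$classNames = array_values(array_unique($classNames));

print_r($ids);
print_r($classNames);
```

Here the foreach loop is appropriate, because getElementsByTagName returns a list of elements rather than a single one.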

Parsing the ids and class names from an HTML file
