PHP: need help parsing HTML elements

Asked: 2017-01-04 05:18:44

Tags: php html arrays parsing

I need help creating a nested array from the elements of an HTML file using DOM methods. This is the code:

<?php
$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
        <li>list-a3</li>
    </ul>
    <p>text2</p>
    <ul>
        <li>list-b1</li>
        <li>list-b2</li>
        <li>list-b3</li>
    </ul>
    <p>text3</p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
foreach ($doc->getElementsByTagName('p') as $link) {
    echo $link->nodeValue, PHP_EOL;
}
foreach ($doc->getElementsByTagName('ul') as $link) {
    $books = $link->getElementsByTagName('li');
    foreach ($books as $book) {
        echo $book->nodeValue, PHP_EOL;
        // $links3[] = array( $ii=> $book->nodeValue, );
        //$ii++;
    }
}
?> 

This is the program's output:

text1
text2
text3
list-a1
list-a2
list-a3
list-b1
list-b2
list-b3

But I need the output in the same order as the original HTML:

text1
list-a1
list-a2
list-a3
text2
list-b1
list-b2
list-b3
text3

Without using preg or replace methods!

1 Answer:

Answer 0: (score: 2)

Printing the values

<?php

    $html = '
        <p>text1</p>
        <ul>
            <li>list-a1</li>
            <li>list-a2</li>
            <li>list-a3</li>
        </ul>
        <p>text2</p>
        <ul>
            <li>list-b1</li>
            <li>list-b2</li>
            <li>list-b3</li>
        </ul>
        <p>text3</p>
    ';

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            if($node->nodeName == 'p'){
                echo $node->nodeValue, PHP_EOL;

            }elseif($node->nodeName == 'ul'){
                $books = $node->getElementsByTagName('li');
                foreach ($books as $book) {
                    echo $book->nodeValue, PHP_EOL;
                }

            }
        }
    }

    ?>

Output

text1 
list-a1 
list-a2 
list-a3 
text2 
list-b1 
list-b2 
list-b3 
text3
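
As a possible alternative to walking the body's child nodes by hand, the same ordering can probably be obtained with a single XPath union, since DOMXPath returns the nodes of a union in document order. This is only a sketch: the shortened sample HTML and the `$xpath` and `$lines` names are assumptions for illustration and are not part of the answer above.

```php
<?php
// Shortened sample input (assumed for illustration).
$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
    </ul>
    <p>text2</p>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$lines = [];

// The union (|) selects top-level <p> elements and the <li> items of
// top-level <ul> elements; the result comes back in document order.
foreach ($xpath->query('//body/p | //body/ul/li') as $node) {
    $lines[] = trim($node->nodeValue);
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```

This avoids both preg and string replacement, and the expression can be extended with further `|` branches if other tags need to be included.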

Printing as a nested array

<?php

$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
        <li>list-a3</li>
    </ul>
    <p>text2</p>
    <ul>
        <li>list-b1</li>
        <li>list-b2</li>
        <li>list-b3</li>
    </ul>
    <p>text3</p>
';

$result = array();

$doc = new DOMDocument();
$doc->loadHTML($html);
$i = 0;
foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $node) {
    if ($node->nodeType === XML_ELEMENT_NODE) {
        if($node->nodeName == 'p'){
            $result[$node->nodeName][$i] = $node->nodeValue;

        }elseif($node->nodeName == 'ul'){
            $result[$node->nodeName][$i] = array();
            $books = $node->getElementsByTagName('li');
            foreach ($books as $book) {
                $result[$node->nodeName][$i][$book->nodeName][] = $book->nodeValue;
            }

        }
        $i++;
    }
}
var_dump($result);

?>
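
The array built above is grouped by tag name, with the position counter `$i` as the key inside each group, so the original interleaving has to be recovered from those keys. If a structure that is already in document order is preferred, one option is a flat list with one entry per top-level element. The following is a minimal sketch under that assumption; the helper name `htmlToOrderedArray` and the `tag`, `text`, and `items` keys are hypothetical and not from the answer.

```php
<?php
// Hypothetical helper: returns one entry per top-level <p> or <ul>,
// in the same order as they appear in the HTML.
function htmlToOrderedArray(string $html): array
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $result = [];
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $result;
    }

    foreach ($body->childNodes as $node) {
        // Skip whitespace text nodes between the elements.
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        if ($node->nodeName === 'p') {
            $result[] = ['tag' => 'p', 'text' => trim($node->nodeValue)];
        } elseif ($node->nodeName === 'ul') {
            $items = [];
            foreach ($node->getElementsByTagName('li') as $li) {
                $items[] = trim($li->nodeValue);
            }
            $result[] = ['tag' => 'ul', 'items' => $items];
        }
    }

    return $result;
}

$result = htmlToOrderedArray(
    '<p>text1</p><ul><li>list-a1</li><li>list-a2</li></ul><p>text2</p>'
);
print_r($result);
```

Iterating over `$result` with a plain `foreach` then visits the paragraphs and lists in their original order, without any sorting by key.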