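How to extract data from a DOMElement and store it in JSON format

Asked: 2016-08-13 06:04:46

Tags: php json domdocument domxpath

I am trying to extract data from a DOMElement and store it in an array that can be encoded as JSON. Suppose I have the following HTML code: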

<header id="header" class="header-color-white">
<div class="container">
    <div class="header-inner">
        <div class="branding">
            <h1 class="logo">
                <a href="index.html"><img src="images/logo@2x.png" alt="" width="25" height="26">NitsOnline</a>
            </h1>
        </div>
        <nav id="nav">
            <ul class="header-top-nav">
                <li class="has-children">
                    <a href="index.html">Home</a>
                    <ul class="sub-nav">
                        <li><a href="index.html">Homepage 1</a></li>
                        <li><a href="homepage2.html">Homepage 2</a></li>
                    </ul>
                </li>
                <li class="has-children">
                    <a href="about.html">About Us</a>
                    <ul class="sub-nav">
                        <li><a href="about1.html">About 1</a></li>
                        <li class="has-children">
                            <a href="about2.html">About 2</a>
                            <ul class="sub-nav">
                                <li><a href="about3.html">About 3</a></li>
                            </ul>
                        </li>
                    </ul>
                </li>
            </ul>
        </nav>
    </div>
</div>
</header>

I store this HTML code in the variable $htmlcode:

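// libxml warns about HTML5 tags such as <header> and <nav>; collect the warnings silently.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($htmlcode);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// DOMXPath takes XPath expressions via query(), not CSS selectors.
$link = $xpath->query('//div[@class="branding"]//a')->item(0);
$img  = $xpath->query('//div[@class="branding"]//a/img')->item(0);
$logo = [
    'logolink' => $link->getAttribute('href'),
    'logourl'  => $img->getAttribute('src'),
    'title'    => trim($link->textContent),
];

$pages = [];
// Top-level menu items: direct <li> children of ul.header-top-nav.
foreach ($xpath->query('//ul[@class="header-top-nav"]/li') as $li) {

    $a = $xpath->query('./a', $li)->item(0);
    $page = [
        'title' => trim($a->textContent),
        'url'   => $a->getAttribute('href'),
    ];

    $submenu = [];
    foreach ($xpath->query('./ul/li', $li) as $sub_li) {

        $sub_a = $xpath->query('./a', $sub_li)->item(0);
        $submenu[] = [
            'title'   => trim($sub_a->textContent),
            'url'     => $sub_a->getAttribute('href'),
            'submenu' => "NULL", // deeper levels are not handled here
        ];

    }

    $page['submenu'] = $submenu ? $submenu : "NULL";
    $pages[] = $page;

}

$contents['logo'] = $logo;
$contents['pages'] = $pages;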

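If a $sub_li has its own sub-menu, I want to loop again to collect the sub-sub-menu. Is this also the correct way to store the data in an array? I want the result to be JSON in this format: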

{
    "logo": { "logolink": "index.html", "logourl": "images/logo@2x.png", "title": "NitsOnline"},
    "pages": [
        {"title": "Home", "url": "index.html", "submenu": [
            {"title": "Homepage 1", "url": "index.html", "submenu": "NULL"},
            {"title": "Homepage 2", "url": "homepage2.html", "submenu": "NULL"}
            ]},
        {"title": "About Us", "url": "about.html", "submenu": [
            {"title": "About 1", "url": "about1.html", "submenu": "NULL"},
            {"title": "About 2", "url": "about2.html", "submenu": [
                {"title": "About 3", "url": "about3.html", "submenu": "NULL"}
                ]}
        ]}
    ]
}
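
A structure like this maps directly onto nested PHP arrays: associative arrays become JSON objects and 0-indexed lists become JSON arrays. The following is only a minimal sketch, assuming a hand-written sample (the contents of $contents here are illustrative and not extracted from the HTML). It also assumes the JSON_UNESCAPED_SLASHES flag is wanted, since json_encode() escapes "/" as "\/" by default.

```php
<?php
// Hypothetical hand-built sample with the same shape as the target JSON.
$contents = [
    'logo' => [
        'logolink' => 'index.html',
        'logourl'  => 'images/logo@2x.png',
        'title'    => 'NitsOnline',
    ],
    'pages' => [
        ['title' => 'Home', 'url' => 'index.html', 'submenu' => [
            ['title' => 'Homepage 1', 'url' => 'index.html', 'submenu' => 'NULL'],
        ]],
    ],
];

// JSON_PRETTY_PRINT adds indentation; JSON_UNESCAPED_SLASHES keeps "/" unescaped.
$json = json_encode($contents, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
echo $json, PHP_EOL;
```

Decoding the string again with json_decode($json, true) should give back the same nested array, which is a quick way to check the shape.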

Please guide me in achieving this, or suggest a better approach.
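
One possible approach to the arbitrary nesting is a recursive function that handles a single <ul> and calls itself for each nested <ul>. The following is a sketch, not a definitive implementation. It assumes PHP 7 or later, well-formed markup in which every sub-menu <ul> sits inside its parent <li>, and a shortened sample document. The helper name extractMenu is hypothetical.

```php
<?php
// Hypothetical helper: builds the entries for one <ul>, recursing into nested <ul>s.
function extractMenu(DOMXPath $xpath, DOMNode $ul): array
{
    $items = [];
    foreach ($xpath->query('./li', $ul) as $li) {
        $a = $xpath->query('./a', $li)->item(0);
        if ($a === null) {
            continue; // skip <li> elements without a direct link
        }
        $childUl = $xpath->query('./ul', $li)->item(0);
        $items[] = [
            'title'   => trim($a->textContent),
            'url'     => $a->getAttribute('href'),
            // The string "NULL" mirrors the desired output; a real null may be cleaner.
            'submenu' => $childUl ? extractMenu($xpath, $childUl) : 'NULL',
        ];
    }
    return $items;
}

// Shortened, well-formed sample markup (an assumption; not the full page).
$htmlcode = '<nav id="nav"><ul class="header-top-nav">'
    . '<li><a href="about.html">About Us</a><ul class="sub-nav">'
    . '<li><a href="about1.html">About 1</a></li>'
    . '<li><a href="about2.html">About 2</a><ul class="sub-nav">'
    . '<li><a href="about3.html">About 3</a></li></ul></li>'
    . '</ul></li></ul></nav>';

libxml_use_internal_errors(true); // libxml warns about HTML5 tags such as <nav>
$dom = new DOMDocument();
$dom->loadHTML($htmlcode);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$root  = $xpath->query('//ul[@class="header-top-nav"]')->item(0);
$pages = extractMenu($xpath, $root);

echo json_encode(['pages' => $pages], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Because the function only looks at direct children (./li, ./a, ./ul), each level collects its own links and leaves deeper levels to the recursive call, so the depth of the menu does not need to be known in advance.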

0 answers:

No answers