Extracting data from ul / li menu elements in a DOMDocument

Time: 2018-05-28 18:37:44

Tags: php

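I have a menu built as an ol / li list, and I want to extract data from it: the TITLE, URL, TAGS and DESC of each item in the ol with a specific class (items-3). I wrote some code that doesn't work and I can't figure out why. Do you have any hints about what I'm doing wrong?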

Menu:

<ol class="items-1">/*---*/</ol>
<ol class="items-2">/*---*/</ol>
<ol class="items-3">
  <li>
    <div class="title">[TITLE]</div>
    <a href="[URL]">
      <span class="tags">[TAGS]</span>
      <span class="desc">[DESC]</span>
      /*---*/
    </a>
  </li>
  <li>
    <div class="title">[TITLE]</div>
    <a href="[URL]">
      <span class="tags">[TAGS]</span>
      <span class="desc">[DESC]</span>
      /*---*/
    </a>
  </li>
  <li>
    <div class="title">[TITLE]</div>
    <a href="[URL]">
      <span class="tags">[TAGS]</span>
      <span class="desc">[DESC]</span>
      /*---*/
    </a>
  </li>
</ol>

Script:

<?php
$html = '<ol class="items-1">/*---*/</ol>
    <ol class="items-2">/*---*/</ol>
    <ol class="items-3">
      <li>
        <div class="title">[TITLE]</div>
        <a href="[URL]">
          <span class="tags">[TAGS]</span>
          <span class="desc">[DESC]</span>
          /*---*/
        </a>
      </li>
      <li>
        <div class="title">[TITLE]</div>
        <a href="[URL]">
          <span class="tags">[TAGS]</span>
          <span class="desc">[DESC]</span>
          /*---*/
        </a>
      </li> </ol>
';

$dom = new DOMDocument();
$dom->loadHTML($html); 
$ol = $dom->getElementsByTagName("ol")[2]; //for items-3 class
$li = $ol->getElementsByTagName("li");
foreach ($li as $element) {
    $title = $element->getElementsByTagName('div')->nodeValue;
    $url = $element->getElementsByTagName('a')->getAttribute('href');
    $tags = $element->getElementsByTagName('span')[0]->nodeValue;
    $desc = $element->getElementsByTagName('span')[1]->nodeValue;
}

?>

Thanks for any help :).

1 answer:

Answer 0 (score: 2)

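getElementsByTagName returns a DOMNodeList, not a single node, so you have to tell PHP which item in the list you want to use. For that, the DOMNodeList class has the method item(), which returns the DOMNode at the given index in the list.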

For example, change this:

$title = $element->getElementsByTagName('div')->nodeValue;

to:

$title = $element->getElementsByTagName('div')->item(0)->nodeValue;
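To make the difference between the list and its nodes concrete, here is a minimal sketch. The sample markup and the variable names are made up for illustration and are not part of the question. It shows that the list itself only exposes length and item(), while nodeValue belongs to the node that item() returns, and that item() gives null for an index that does not exist.

```php
<?php
// Hypothetical sample markup, not taken from the question.
$dom = new DOMDocument();
$dom->loadHTML('<ul><li><div class="title">First</div></li></ul>');

// getElementsByTagName() gives back a list of matches, not a single node.
$divs = $dom->getElementsByTagName('div');

var_dump($divs instanceof DOMNodeList);
var_dump($divs->length);

// item() returns the node at that index, or null when the index is out of range.
$first   = $divs->item(0);
$missing = $divs->item(5);

// Guard against null before reading nodeValue.
$title = $first !== null ? $first->nodeValue : null;
echo $title, PHP_EOL;
var_dump($missing);
```

Checking for null before reading nodeValue is optional for the menu in the question, where every li has the expected children, but it avoids an error if one of them is ever missing.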

Corrected code:

$dom = new DOMDocument();
$dom->loadHTML($html); 
$ol = $dom->getElementsByTagName("ol")->item(2); //for items-3 class
$li = $ol->getElementsByTagName("li");
foreach ($li as $element) {
    $title = $element->getElementsByTagName('div')->item(0)->nodeValue;
    $url = $element->getElementsByTagName('a')->item(0)->getAttribute('href');
    $tags = $element->getElementsByTagName('span')->item(0)->nodeValue;
    $desc = $element->getElementsByTagName('span')->item(1)->nodeValue;
}

Working snippet: https://3v4l.org/6hcOt
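The corrected code picks the third ol by position, which breaks if the order of the lists changes. As an alternative (not what the answer above uses), the ol can be selected by its class with DOMXPath. The following is only a sketch under some assumptions: the markup is a shortened copy of the question's menu with placeholder values, firstText() is a hypothetical helper introduced here, and the results are collected into an $items array because the original loop overwrites its variables on every iteration.

```php
<?php
// Shortened copy of the menu from the question, with placeholder values.
$html = '<ol class="items-1"></ol>
<ol class="items-2"></ol>
<ol class="items-3">
  <li>
    <div class="title">Title A</div>
    <a href="/a"><span class="tags">tag-a</span><span class="desc">Desc A</span></a>
  </li>
  <li>
    <div class="title">Title B</div>
    <a href="/b"><span class="tags">tag-b</span><span class="desc">Desc B</span></a>
  </li>
</ol>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical helper: trimmed text of the first node matching $query
// relative to $context, or null when nothing matches.
function firstText(DOMXPath $xpath, string $query, DOMNode $context): ?string
{
    $node = $xpath->query($query, $context)->item(0);
    return $node !== null ? trim($node->nodeValue) : null;
}

$items = [];
// Select the <li> children of the <ol> whose class attribute contains the token "items-3".
$query = '//ol[contains(concat(" ", normalize-space(@class), " "), " items-3 ")]/li';
foreach ($xpath->query($query) as $li) {
    $items[] = [
        'title' => firstText($xpath, './/div[@class="title"]', $li),
        'url'   => firstText($xpath, './/a/@href', $li),
        'tags'  => firstText($xpath, './/span[@class="tags"]', $li),
        'desc'  => firstText($xpath, './/span[@class="desc"]', $li),
    ];
}

print_r($items);
```

Matching span elements by class rather than by index 0 and 1 also keeps the code working if the order of the tags and desc spans is swapped.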