Get the tag value from every tag inside nested tags in HTML using PHP

Time: 2018-04-02 14:23:04

Tags: php html

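I am working on code that gets all tag values ("text only") from an HTML file. If a tag has nested tags, the code should descend into the children and get the values of the tags that have no children of their own. I tried the following, but something is missing.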

PHP code:

$dochtml = new DOMDocument();
$dochtml->loadHTMLFile("index2.html");
$nodes = $dochtml->getElementsByTagName("a");
gettagsvalue($nodes);

function gettagsvalue($nodes) {
    if ($nodes->length != 0) {
        for ($i = 0; $i < $nodes->length; $i++) {
            foreach ($tags = ["h1","h2","h3","h4","h5","h6","h7","a","img","li","span","p","pre","i","strong","div","ul"] as $tag) {
                if ($nodes->item($i)->getElementsByTagName($tag)->length != 0) {
                    if ($nodes->item($i)->getElementsByTagName($tag)->length == 1) {
                        echo "here" . "<br><br><br> $tag";
                        echo "<pre>";
                        print_r($nodes->item($i)->getElementsByTagName($tag)->item(0));
                        echo "</pre>";
                    } else {
                        echo "there" . "<br><br><br> $tag";
                        gettagsvalue($nodes->item($i)->getElementsByTagName($tag));
                        // echo "$tag <br><br><br>";
                    }
                    // print_r($nodes->item($i)->getElementsByTagName($tag)); echo "<br>";
                }
            }
        }
    }
}

I expect to get

"Green" "Valley"

HTML:

<a href="index.html" id="aaaaaaaaaaaa2015284957">
    <img src="images/logo.png" width="50px" height="50px" id="imgaaaaaaaaaaimg732756221">
    <span>Green</span>
    <span id="spanaaaaaaaaaaspan1106733773">Valley</span>
</a>

1 Answer:

Answer 0 (score: 0)

Have you considered using the textContent property? It should concatenate the text nodes of all nested nodes. For more information, see "php domdocument read element inner text" and "PHP DOM textContent vs nodeValue?".
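
A minimal sketch of that suggestion, not a definitive implementation. It assumes the markup from the question is supplied as an inline string instead of being read from index2.html. The helper name collectText is hypothetical and is not part of PHP's DOM extension. It is included because textContent returns one concatenated string, while the question asks for "Green" and "Valley" as separate values.

```php
<?php
// Assumption: the question's markup, inlined here instead of loading index2.html.
$html = <<<'HTML'
<a href="index.html" id="aaaaaaaaaaaa2015284957">
    <img src="images/logo.png" width="50px" height="50px" id="imgaaaaaaaaaaimg732756221">
    <span>Green</span>
    <span id="spanaaaaaaaaaaspan1106733773">Valley</span>
</a>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);

// Hypothetical helper: walks the subtree depth-first and returns each
// non-empty text node as its own array entry.
function collectText(DOMNode $node): array
{
    $values = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            $text = trim($child->data);
            if ($text !== '') {
                $values[] = $text;
            }
        } elseif ($child instanceof DOMElement) {
            // Recurse into nested tags regardless of their name.
            $values = array_merge($values, collectText($child));
        }
    }
    return $values;
}

foreach ($doc->getElementsByTagName('a') as $anchor) {
    // textContent joins all descendant text, including the whitespace
    // between tags, so collapse and trim it before printing.
    echo trim(preg_replace('/\s+/', ' ', $anchor->textContent)), "\n";

    // One entry per text node: "Green" and "Valley" for the markup above.
    print_r(collectText($anchor));
}
```

Walking childNodes avoids the fixed list of tag names used in the question, so any nesting depth and any element name is handled the same way.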