I am trying to get the values of child nodes from HTML.
What I tried:
$data = $dom->getElementById($identifier);
$nodes = $data->childNodes;
foreach ($nodes as $node)
{
    echo $node->nodeName;
    echo $node->nodeValue;
}
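This prints the value of every child node, even when a child tag has no value; in that case I get an empty string.
Is there a way to get the value only when the tag actually has one?
Update:
My HTML data: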
<div id="myid"> //I give this id as input
<h1> Some data 1</h1>
<script type=".."> google adsense details </script>
<p class="some class"> </p>
<div class="some class1"></div>
<h2>data2</h2>
<p>SOme more data...blah blah..</p>
</div>
The output I want:
Some data 1
data2
SOme more data...blah blah..
What I get:
Some data 1
google adsense details //I am getting the values inside script as well
//blank data (many spaces) from the p tag
//blank data (many spaces) from the div tag
data2
SOme more data...blah blah..
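Answer 0: (score: 1)
After some testing, this should work for what you are trying to accomplish:
→ Edit 1: This solution iterates over multiple child nodes inside the element with the given identifier.
→ Edit 2: This solution lets you specify the tags/values you do not want returned.
→ Edit 3: Pulled out the details of the original question, which are not relevant to the updated question.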
$dom = new DOMDocument();
$html = '<div id="myid"> //I give this id as input<h1> Some data 1</h1><script type=".."> google adsense details </script><p class="some class"></p><div class="some class1"></div><h2>data2</h2><p>SOme more data...blah blah..</p></div>';
$dom->loadHTML( $html );
$identifier = "myid";
$id_nodes = $dom->getElementById( $identifier );
foreach( $id_nodes->childNodes as $node )
{
    // Blacklist for what you do not want in your output:
    // script tags, bare text nodes, and empty or whitespace-only values.
    if( $node->nodeName != "script" && $node->nodeName != "#text" && trim( $node->nodeValue ) != '' ) {
        echo $node->nodeValue . "<br />";
    }
}
The output of the script above is:
Some data 1
data2
SOme more data...blah blah..
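The same filtering can also be written as a small reusable function. The following is a minimal sketch, not part of the original answer: the name `collectChildText` and its `$skipTags` parameter are hypothetical, and it assumes you only want the direct child elements of the node with the given id. It checks `nodeType` instead of comparing `nodeName` to `"#text"`, and trims each value so whitespace-only tags such as `<p class="some class"> </p>` are dropped.

```php
<?php
// Hypothetical helper (illustration only): returns the trimmed, non-empty
// text of each direct child element of the element with the given id.
function collectChildText(DOMDocument $dom, string $id, array $skipTags = ['script', 'style']): array
{
    $parent = $dom->getElementById($id);
    if ($parent === null) {
        return []; // no element with that id
    }

    $values = [];
    foreach ($parent->childNodes as $child) {
        // Keep element nodes only; this skips text nodes and comments.
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        // Skip tags whose content should never be printed.
        if (in_array(strtolower($child->nodeName), $skipTags, true)) {
            continue;
        }
        // Drop values that are empty or contain only whitespace.
        $text = trim($child->textContent);
        if ($text !== '') {
            $values[] = $text;
        }
    }
    return $values;
}

$html = '<div id="myid"><h1> Some data 1</h1><script type=".."> google adsense details </script>'
      . '<p class="some class"> </p><div class="some class1"></div>'
      . '<h2>data2</h2><p>SOme more data...blah blah..</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

foreach (collectChildText($dom, 'myid') as $value) {
    echo $value, PHP_EOL;
}
```

Returning an array rather than echoing inside the loop keeps the filtering separate from the output format, so the caller can join the values with `<br />`, newlines, or anything else.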
Answer 1: (score: 0)
Check the value before echoing it?
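// $data is the element returned by getElementById(); iterate its child nodes.
foreach ($data->childNodes as $node)
{
    // trim() so that whitespace-only values count as empty
    if (strlen(trim($node->nodeValue)) > 0)
    {
        echo $node->nodeName;
        echo $node->nodeValue;
    }
}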
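A different way to do this check is to push it into the query itself with `DOMXPath`, which is a swapped-in technique rather than what this answer proposes. The sketch below assumes the HTML from the question and the id `myid`; the XPath expression selects direct child elements that are not `script` and whose whitespace-normalized text is non-empty.

```php
<?php
$html = '<div id="myid"><h1> Some data 1</h1><script type=".."> google adsense details </script>'
      . '<p class="some class"> </p><div class="some class1"></div>'
      . '<h2>data2</h2><p>SOme more data...blah blah..</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Direct children of #myid, excluding <script> and anything whose text is
// empty once leading/trailing whitespace is removed.
$nodes = $xpath->query('//*[@id="myid"]/*[not(self::script) and normalize-space() != ""]');

$values = [];
foreach ($nodes as $node) {
    $values[] = trim($node->textContent);
}

echo implode(PHP_EOL, $values), PHP_EOL;
```

With this approach the loop body no longer needs any conditions, at the cost of a less readable selector.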