I have this XML file:
http://www.metacafe.com/tags/cats/rss.xml
Using this code:
$xml = simplexml_load_file('http://www.metacafe.com/tags/cats/rss.xml', 'SimpleXMLElement', LIBXML_NOCDATA);
echo $xml->channel->item->title . "<br>";
echo $xml->channel->item->description . "<br>";
I get this output:
Dad Challenges Kids to Climb Walls to Get Candy<br>
<a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/"><img src="http://s3.mcstatic.com/thumb/11150410/28824820/4/directors_cut/0/1/dad_challenges_kids_to_climb_walls_to_get_candy.jpg?v=1" align="right" border="0" alt="Dad Challenges Kids to Climb Walls to Get Candy" vspace="4" hspace="4" width="134" height="78" /></a>
<p>
Nick Dietz compiles some of the week's best viral videos,
including an elephant trying really hard to break a stick, a cat
sunbathing and kids climbing up the walls to get candy. Plus,
making music with a Ford Fiesta.
<br>Ranked <strong>4.00</strong> / 5 | 78 views | <a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/">0 comments</a><br/>
</p>
<p>
<a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/"><strong>Click here to watch the video</strong></a> (02:38)<br/>
Submitted By: <a href="http://www.metacafe.com/channels/CBS/">CBS</a><br/>
Tags:
<a href="http://www.metacafe.com/topics/penna/">Penna</a>
<a href="http://www.metacafe.com/topics/bjbj/">Bjbj</a>
<a href="http://www.metacafe.com/topics/ciao/">Ciao</a> <br/>
Categories: <a href='http://www.metacafe.com/videos/entertainment/'>Entertainment</a>
</p>
<br>
I need to get this output instead (all other elements should be removed):
Dad Challenges Kids to Climb Walls to Get Candy
Nick Dietz compiles some of the week's best viral videos,
including an elephant trying really hard to break a stick, a cat
sunbathing and kids climbing up the walls to get candy. Plus,
making music with a Ford Fiesta.
I don't know how to proceed to get this result.
Answer 0: (score: 1)
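The reason you get elements in the description is the CDATA section. To an XML parser, the content of a CDATA section is always text. Elements like <p> inside it are not read into the DOM structure.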
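To make the CDATA point concrete, here is a minimal sketch. It uses a small inline feed as a stand-in for the real RSS file (the sample markup is an assumption, not the actual feed content) and shows that the description has no child elements, only a string that happens to contain markup.

```php
<?php
// Inline stand-in for the feed; the real file is not fetched here.
$rss = <<<'XML'
<rss version="2.0"><channel><item>
<description><![CDATA[<p>Some <strong>text</strong></p>]]></description>
</item></channel></rss>
XML;

$xml = simplexml_load_string($rss, 'SimpleXMLElement', LIBXML_NOCDATA);
$description = $xml->channel->item->description;

// The <p> and <strong> tags live inside CDATA, so they are not child elements.
$childCount = count($description->children());
// Casting to string returns the markup as plain text.
$raw = (string) $description;

var_dump($childCount); // int(0)
var_dump($raw);
```

This is why the markup has to be handled in a second step, as text or as a separately parsed HTML fragment.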
A simple strip_tags() call will remove all elements. For more control, you need to load the HTML fragment into a DOM:
$html = <<<'HTML'
<a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/"><img src="http://s3.mcstatic.com/thumb/11150410/28824820/4/directors_cut/0/1/dad_challenges_kids_to_climb_walls_to_get_candy.jpg?v=1" align="right" border="0" alt="Dad Challenges Kids to Climb Walls to Get Candy" vspace="4" hspace="4" width="134" height="78" /></a>
<p>
Nick Dietz compiles some of the week's best viral videos,
including an elephant trying really hard to break a stick, a cat
sunbathing and kids climbing up the walls to get candy. Plus,
making music with a Ford Fiesta.
<br>Ranked <strong>4.00</strong> / 5 | 78 views | <a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/">0 comments</a><br/>
</p>
<p>
<a href="http://www.metacafe.com/watch/cb-M0fIp1ctKtsn/dad_challenges_kids_to_climb_walls_to_get_candy/"><strong>Click here to watch the video</strong></a> (02:38)<br/>
Submitted By: <a href="http://www.metacafe.com/channels/CBS/">CBS</a><br/>
Tags:
<a href="http://www.metacafe.com/topics/penna/">Penna</a> <br/>
Categories: <a href='http://www.metacafe.com/videos/entertainment/'>Entertainment</a>
</p>
<br>
HTML;
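$dom = new DOMDocument();
// Parse the fragment; the HTML parser wraps it in <html><body> automatically.
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Text nodes directly inside the first <p>; string() keeps only the first one.
$content = $xpath->evaluate("string(//p[1]/text())");
var_dump($content);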
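//p[1]/text()
selects the text nodes that are direct children of the first p. The string() function converts the first node of that set to a string, which here is the text up to the first <br>. If no node matches, the expression returns an empty string.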
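Putting the pieces together, the following is a rough sketch of how the title and the cleaned description could be read from a feed item. The helper name firstParagraphText() and the shortened inline feed are assumptions for illustration; the whitespace collapsing is an extra step that the answer above does not include.

```php
<?php
// Hypothetical helper: returns the leading text of the first <p> in an HTML fragment.
function firstParagraphText(string $html): string
{
    if (trim($html) === '') {
        return ''; // loadHTML() rejects empty input
    }
    $dom = new DOMDocument();
    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // string() converts only the first text node of the first <p>.
    $text = $xpath->evaluate('string(//p[1]/text())');
    // Collapse line breaks and runs of spaces, then trim the ends.
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Shortened inline stand-in for the feed; the real file is not fetched here.
$rss = <<<'XML'
<rss version="2.0"><channel><item>
<title>Dad Challenges Kids to Climb Walls to Get Candy</title>
<description><![CDATA[<a href="http://www.metacafe.com/"><img src="thumb.jpg" /></a>
<p>
Nick Dietz compiles some of the week's best viral videos.
<br>Ranked <strong>4.00</strong> / 5
</p>]]></description>
</item></channel></rss>
XML;

$xml = simplexml_load_string($rss, 'SimpleXMLElement', LIBXML_NOCDATA);
$item = $xml->channel->item;
$title = (string) $item->title;
$summary = firstParagraphText((string) $item->description);

echo $title, "\n", $summary, "\n";
```

For the live feed, simplexml_load_file() with the URL from the question would take the place of simplexml_load_string(), and a foreach over $xml->channel->item would handle every entry rather than only the first.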