Reading CDATA from an RSS feed

Date: 2011-11-05 12:30:49

Tags: php xml cdata

I am reading an RSS feed with this simple code:

 <?php
$homepage = file_get_contents('http://www.forbes.com/news/index.xml');
$movies = new SimpleXMLElement($homepage);
echo '<pre>';
print_r($movies);
?>

and the output looks like this:

SimpleXMLElement Object
(
[@attributes] => Array
    (
        [version] => 2.0
    )

[channel] => SimpleXMLElement Object
    (
        [title] => SimpleXMLElement Object
            (
            )

        [link] => SimpleXMLElement Object
            (
            )

        [description] => SimpleXMLElement Object
            (
            )

        [language] => en-us
        [copyright] => Copyright 2009 Forbes.com LLC
        [item] => Array
            (
                [0] => SimpleXMLElement Object
                    (
                        [title] => SimpleXMLElement Object
                            (
                            )

                        [link] => SimpleXMLElement Object
                            (
                            )

                        [author] => SimpleXMLElement Object
                            (
                            )

                        [pubDate] => Sat, 05 Nov 2011 07:17:21 GMT
                        [description] => SimpleXMLElement Object
                            (
                            )

                    )

and so on. But when I view the source of this page, I see this:

 <rss version="2.0"><channel><title><![CDATA[Forbes.com: News]]></title><link><![CDATA[http://www.forbes.com]]></link><description><![CDATA[News and reports from Forbes.com]]></description><language>en-us</language><copyright>Copyright 2009 Forbes.com LLC</copyright><item><title><![CDATA[Benicio Del Toro Offered Villain Role In "Star Trek" Sequel - Is It Khan?]]></title><link><![CDATA[http://www.forbes.com/sites/markhughes/2011/11/05/benicio-del-toro-offered-villain-role-in-star-trek-sequel-is-it-khan/?feed=rss_home]]></link><author><![CDATA[Mark Hughes]]></author><pubDate>Sat, 05 Nov 2011 07:17:21 GMT</pubDate><description><![CDATA[Variety reports that actor Benicio del Toro is being offered the role of villain in the upcoming sequel to director J.J. Abram?s 2009 blockbuster franchise-reboot movie Star Trek. So far, Abrams and crew have kept a tight lid on details about the new Paramount film, and the identity of the main villain is a closely ...]]></description>

How can I read the CDATA values and store them in my database?

2 answers:

Answer 0: (score: 10)

Tell SimpleXML to convert CDATA into plain text:

$homepage = 'http://www.forbes.com/news/index.xml';
$movies = simplexml_load_file($homepage, "SimpleXMLElement", LIBXML_NOCDATA);

Using simplexml_load_file instead of file_get_contents, with the LIBXML_NOCDATA option, should do it for you.
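To see the effect of LIBXML_NOCDATA without fetching the live feed, here is a minimal sketch. It assumes a small inline XML string (a cut-down stand-in for the Forbes feed, not its real content) and uses simplexml_load_string, which accepts the same class name and options arguments as simplexml_load_file.

```php
<?php
// Inline sample standing in for the feed, so the sketch runs without network access.
$xml = '<rss version="2.0"><channel>'
     . '<title><![CDATA[Forbes.com: News]]></title>'
     . '<language>en-us</language>'
     . '</channel></rss>';

// Default parsing: CDATA sections are kept as separate CDATA nodes.
$plain = simplexml_load_string($xml);

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes,
// so the text also becomes visible in print_r output.
$merged = simplexml_load_string($xml, "SimpleXMLElement", LIBXML_NOCDATA);

echo "Without LIBXML_NOCDATA:\n";
print_r($plain->channel);

echo "With LIBXML_NOCDATA:\n";
print_r($merged->channel);
```

Comparing the two dumps side by side shows that the option changes only how the text is represented inside the object, not which data was loaded.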

Related answer: Removing cdata in simplehtmldom

Answer 1: (score: 3)

The "fix" above will work, but it is completely unnecessary.

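SimpleXML objects contain a lot of "magic" and are not designed to be inspected with print_r; the CDATA is sitting safely in your object, but it will not show up unless you ask for it in the right way.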

If you run echo (string)$movies->channel->title; you should get "Forbes.com: News", as expected.

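Note the (string), which tells PHP to explicitly cast the "magic" SimpleXMLElement to a string. Without the cast, you actually get back another SimpleXMLElement object; if that were not so, my example would not work, because $movies->channel would be a string and could not be traversed further.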
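A small sketch of the difference the cast makes, assuming an inline XML string in place of the real feed and the illustrative variable names $node and $text:

```php
<?php
// Inline stand-in for the feed; the CDATA is parsed with default options.
$xml = '<rss version="2.0"><channel>'
     . '<title><![CDATA[Forbes.com: News]]></title>'
     . '</channel></rss>';
$movies = new SimpleXMLElement($xml);

$node = $movies->channel->title;          // no cast: still a SimpleXMLElement
$text = (string) $movies->channel->title; // cast: a plain PHP string

var_dump($node instanceof SimpleXMLElement); // bool(true)
var_dump(is_string($text));                  // bool(true)
echo $text, "\n";                            // Forbes.com: News
```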

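It is a good habit to always use (string) when accessing an element or attribute from SimpleXML, because some functions get confused if they expect a string and you hand them a SimpleXML object, and serialization or session storage will definitely fail.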
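For the original goal of storing the values, one possible approach is to copy every field into a plain array of strings first. The sketch below assumes an inline two-item feed with made-up headlines and example.com links; the names $rows and $rawSerializable are illustrative. The resulting rows are ordinary strings, so they could be bound to a prepared INSERT statement in whatever database layer you use (not shown here).

```php
<?php
// Made-up feed with two items, standing in for the real RSS document.
$xml = '<rss version="2.0"><channel>'
     . '<item><title><![CDATA[First headline]]></title>'
     . '<link><![CDATA[http://example.com/1]]></link>'
     . '<pubDate>Sat, 05 Nov 2011 07:17:21 GMT</pubDate></item>'
     . '<item><title><![CDATA[Second headline]]></title>'
     . '<link><![CDATA[http://example.com/2]]></link>'
     . '<pubDate>Sat, 05 Nov 2011 08:00:00 GMT</pubDate></item>'
     . '</channel></rss>';
$feed = new SimpleXMLElement($xml);

// Cast every value to string so the rows contain no SimpleXMLElement objects.
$rows = array();
foreach ($feed->channel->item as $item) {
    $rows[] = array(
        'title'   => (string) $item->title,
        'link'    => (string) $item->link,
        'pubDate' => (string) $item->pubDate,
    );
}

// Plain string arrays survive a serialize/unserialize round trip.
$restored = unserialize(serialize($rows));

// Serializing the raw SimpleXMLElement is rejected by PHP with an exception.
try {
    serialize($feed->channel->item[0]);
    $rawSerializable = true;
} catch (Exception $e) {
    $rawSerializable = false;
}

print_r($rows);
var_dump($rawSerializable);
```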