PHP Fatal error: String could not be parsed as XML

Asked: 2011-09-29 13:35:01

Tags: php xml curl rss

I am trying to add an RSS feed to my website. The following code works locally but causes a fatal error on the live site:

<?php
// Initialise the cURL resource handle:
$ch = curl_init("http://www.blogs.stopjunkmail.org.uk/diary/index.php?/feeds/index.rss2");
// Set connection options:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
// Execute connection, wait for response, and close:
$data = curl_exec($ch);
curl_close($ch);
// Parse the data:
$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
// Define the function to parse RSS:
function parseRSS($doc) {
    echo '<ul>' . "\n";
    for($i=0; $i<5; $i++) {
        $url    = $doc->channel->item[$i]->link;
        $title  = $doc->channel->item[$i]->title;
        $date   = $doc->channel->item[$i]->pubDate;
        echo '<li>' . "\n";
        echo '<a href="'.$url.'">'.$title.'</a>' . "\n";
        echo '</li>' . "\n";
    }
    echo '</ul>' . "\n";
}
?>
<!doctype html>
<html lang="en-GB">
<head>
 <meta charset="UTF-8" />
 <title>Test feed</title>
</head>
<body>

 <h2>Recent blog entries</h2>
<?php parseRSS($doc); ?>

</body>
</html>

This results in the following error:

[Thu Sep 29 12:06:28 2011] [error] [client xx.xx.xx.xxx] PHP Fatal error:  Uncaught exception 'Exception' with message 'String could not be parsed as XML' in /home/sites/stopjunkmail.org.uk/public_html/news/_test.php:11
[Thu Sep 29 12:06:28 2011] [error] [client xx.xx.xx.xxx] Stack trace:
[Thu Sep 29 12:06:28 2011] [error] [client xx.xx.xx.xxx] #0 /home/sites/stopjunkmail.org.uk/public_html/news/_test.php(11): SimpleXMLElement->__construct('', 16384)
[Thu Sep 29 12:06:28 2011] [error] [client xx.xx.xx.xxx] #1 {main}
[Thu Sep 29 12:06:28 2011] [error] [client xx.xx.xx.xxx]   thrown in /home/sites/stopjunkmail.org.uk/public_html/news/_test.php on line 11

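After a lot of trial and error and looking at similar questions, I found that the feed itself is causing the problem. If I switch to a very basic sample feed (for example http://feedparser.org/docs/examples/rss20.xml), everything works. The feed I am trying to parse is valid (albeit with a few warnings).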

So the question is: what do I need to do to make the script accept the feed?

2 answers:

Answer 0: (score: 1)

Use mb_convert_encoding() to convert the data to UTF-8, and don't forget to call the parseRSS() function.

<?php
// Initialise the cURL resource handle:
$ch = curl_init("http://www.blogs.stopjunkmail.org.uk/diary/index.php?/feeds/index.rss2");
// Set connection options:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
// Execute connection, wait for response, and close:
$data = curl_exec($ch);
curl_close($ch);
// Convert the response to UTF-8 before parsing:
$enc = mb_detect_encoding($data);
$data = mb_convert_encoding($data, 'UTF-8', $enc);
// Parse the data ($doc is what parseRSS() below expects):
$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);

// Define the function to parse RSS:
function parseRSS($doc) {
    echo '<ul>' . "\n";
    for($i=0; $i<5; $i++) {
        $url    = $doc->channel->item[$i]->link;
        $title  = $doc->channel->item[$i]->title;
        $date   = $doc->channel->item[$i]->pubDate;
        echo '<li>' . "\n";
        echo '<a href="'.$url.'">'.$title.'</a>' . "\n";
        echo '</li>' . "\n";
    }
    echo '</ul>' . "\n";
}
parseRSS($doc);
?>
<!doctype html>
<html lang="en-GB">
<head>
 <meta charset="UTF-8" />
 <title>Test feed</title>
</head>
<body>
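
Whichever encoding fix is applied, the parsing step can also be made defensive, so that an empty or malformed response yields an empty list instead of an uncaught exception. The following is only a minimal sketch under stated assumptions: the helper names readFeedItems() and renderFeedList() are hypothetical and not part of the original answer, and it assumes the RSS 2.0 layout (channel/item/link/title) that the question's code already relies on.

```php
<?php
// Hypothetical helper: returns up to $limit items as arrays with 'url' and
// 'title' keys, or an empty array if the XML is empty or cannot be parsed.
function readFeedItems(string $xml, int $limit = 5): array
{
    if (trim($xml) === '') {
        return [];
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false || !isset($doc->channel->item)) {
        return [];
    }

    $items = [];
    // Iterate over the items that actually exist rather than assuming five.
    foreach ($doc->channel->item as $item) {
        if (count($items) >= $limit) {
            break;
        }
        $items[] = [
            'url'   => (string) $item->link,
            'title' => (string) $item->title,
        ];
    }
    return $items;
}

// Hypothetical helper: builds the <ul> markup, escaping feed text for HTML.
function renderFeedList(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        $html .= '<li><a href="' . htmlspecialchars($item['url'], ENT_QUOTES, 'UTF-8') . '">'
               . htmlspecialchars($item['title'], ENT_QUOTES, 'UTF-8') . "</a></li>\n";
    }
    return $html . "</ul>\n";
}
```

With these helpers, the page body could call echo renderFeedList(readFeedItems($data)); in place of parseRSS($doc). Returning an empty array on failure is a design choice for this sketch; logging the libxml errors before clearing them may be more useful while debugging.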

Answer 1: (score: 0)

Might forcing the MIME type be another option?

curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: text/xml'));

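Edit: there may also be a problem with cURL on your web server compared with your local environment. If the PHP versions differ, that could be a factor as well.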
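
One way to check whether cURL on the server is at fault: the stack trace in the question shows SimpleXMLElement->__construct('', 16384), which means an empty string reached the parser, so the request itself may have failed on the live site. The sketch below inspects the result of curl_exec() before parsing. The function names requireFeedBody() and fetchFeed(), the 10-second timeout, and the decision to follow redirects are all assumptions made for illustration, not something the thread confirms.

```php
<?php
// Hypothetical helper: validates what curl_exec() returned before it is parsed.
// $body is the return value of curl_exec() (a string, or false on failure).
function requireFeedBody($body, string $curlError, int $httpCode): string
{
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . $curlError);
    }
    if ($httpCode >= 400) {
        throw new RuntimeException('Feed server answered with HTTP ' . $httpCode);
    }
    if (trim($body) === '') {
        throw new RuntimeException('Feed response body was empty (HTTP ' . $httpCode . ')');
    }
    return $body;
}

// Hypothetical wrapper around the cURL calls used in the question.
function fetchFeed(string $url): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumption: redirects should be followed
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumption: 10 seconds is acceptable

    $body     = curl_exec($ch);
    $error    = curl_error($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return requireFeedBody($body, $error, $httpCode);
}
```

Calling fetchFeed() inside a try/catch block and logging the exception message should show whether the live server is being refused, timing out, or receiving an empty body. Note also that CURLOPT_HTTPHEADER sets headers on the outgoing request, so it does not change the type of the response the server sends back.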