Question

I'm trying to use SimplePie to parse RSS feeds for a client (the clients are writers for the Washington Times).

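After reading the documentation and using the sample code as a reference, I was able to get the parsed feeds onto the website, but the problem I'm now running into is that apostrophe characters are not being decoded (' is displayed as “).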

I tried the fixes suggested in the SimplePie FAQ:

1. Verifying the site's meta tags
2. Using SimplePie's handle_content_type() function
3. Using PHP's built-in header() function to correct the HTTP headers

Unfortunately, none of these solved my problem.

Here is the code I'm using to parse the RSS feeds:

<?php

require_once('php/autoloader.php');

$feedJB = new SimplePie();
$feedJB->set_feed_url('http://washingtontimes.dynamic.feedsportal.com/pf/637323/communities.washingtontimes.com/neighborhood/feeds/latest/status-update/');
$feedJB->init();
$feedJB->handle_content_type();

$feedRB = new SimplePie();
$feedRB->set_feed_url('http://washingtontimes.dynamic.feedsportal.com/pf/637323/communities.washingtontimes.com/neighborhood/feeds/latest/2nd-golden-era-advertising/');
$feedRB->init();
$feedRB->handle_content_type();

?>

Here is the output code on the page:

<!-- Left -->
            <li class="left">
                <h3>Recent Posts</h3>
                <ul class="feed-list">
                    <?php foreach ($feedJB->get_items(0, 5) as $item): ?>
                    <li>
                        <strong><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></strong>
                        <small>Posted on <?php echo $item->get_date('j F Y'); ?></small>
                    </li>
                    <?php endforeach; ?>
                    <li><h4><a href="<?php echo $feedJB->get_permalink(); ?>">Read more articles by Jeff</a></h4></li>
                </ul>
            </li>
            <!-- /Left -->

            <!-- Right -->
            <li class="right">
                <h3>Recent Posts</h3>
                <ul class="feed-list">
                    <?php foreach ($feedRB->get_items(0, 5) as $item): ?>
                    <li>
                        <strong><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></strong>
                        <small>Posted on <?php echo $item->get_date('j F Y'); ?></small>
                    </li>
                    <?php endforeach; ?>
                    <li><h4><a href="<?php echo $feedRB->get_permalink(); ?>">Read more articles by Rob</a></h4></li>
                </ul>
            </li>
            <!-- /Right -->

I've tested this locally on my machine (a Mac Pro running MAMP on Lion) as well as on my web server (Linux running Apache 2.2.22 and PHP 5.2.17).

For the time being, you can also see this live at the following link: http://clients.josephmainwaring.com/statuscreative/#!columns.php

If anyone has suggestions for fixing this character encoding problem, they would be greatly appreciated.

Answer 1

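I've found that the Washington Post's feeds are all declared as ISO-8859-1, even though they contain UTF-8 characters. I don't use SimplePie, but every time I fetch a feed I run it through the following function, where $xml is the text of the feed and $url is the feed's URL: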

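// Normalize a feed's raw XML before handing it to a parser.
// $xml is the feed text, $url is the feed's URL.
function feed_fix_broken($xml, $url) {
  // Drop any byte sequences that are not valid UTF-8.
  $xml = iconv('UTF-8', 'UTF-8//IGNORE', $xml);
  // Domains whose feeds declare one encoding but contain UTF-8 characters.
  $broken = array('washingtonpost.com' => 'ISO-8859-1');
  foreach ($broken as $domain => $encoding) {
    // Case-insensitive match of the domain anywhere in the URL.
    if (stristr($url, $domain)) {
      // Convert to the declared encoding, approximating unmappable characters.
      $xml = iconv('UTF-8', $encoding . '//TRANSLIT', $xml);
    }
  }
  return $xml;
}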

This transliterates UTF-8 encoded entities to their ISO-8859-1 counterparts wherever possible.
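If you want to check whether a given feed has this kind of mismatch before adding its domain to the list, something like the following might help. It is only a rough sketch: `declared_encoding()` and `looks_like_utf8()` are hypothetical helper names, the sample XML is made up, and the check is a heuristic (valid multi-byte UTF-8 can occasionally also be legitimate ISO-8859-1 text).

```php
<?php
// Hypothetical helper: return the encoding named in the XML declaration, or null.
function declared_encoding($xml) {
    if (preg_match('/^\s*<\?xml[^>]*encoding=["\']([^"\']+)["\']/i', $xml, $m)) {
        return strtoupper($m[1]);
    }
    return null; // no declaration present
}

// Hypothetical helper: true when the bytes are valid UTF-8 and contain
// at least one non-ASCII byte (so the encoding label actually matters).
function looks_like_utf8($xml) {
    // With the /u modifier, preg_match fails on input that is not valid UTF-8.
    $valid = preg_match('//u', $xml) === 1;
    $hasNonAscii = preg_match('/[\x80-\xFF]/', $xml) === 1;
    return $valid && $hasNonAscii;
}

// Made-up sample: labeled ISO-8859-1, but the apostrophe is the
// three-byte UTF-8 sequence for U+2019 (right single quotation mark).
$sample = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
        . "<title>Jeff\xE2\x80\x99s post</title>";

$declared   = declared_encoding($sample);
$mislabeled = $declared !== null && $declared !== 'UTF-8' && looks_like_utf8($sample);

var_dump($mislabeled);
```

A feed flagged this way is a candidate for the `$broken` array in the function above.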

Note that in FeedDemon, "Chávez" is screwy...

but I've got it right.
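For what it's worth, here is a small sketch of how a name like "Chávez" can end up garbled. It assumes the reader trusts an ISO-8859-1 label and decodes UTF-8 bytes one byte per character; I can't say for sure that this is what FeedDemon does internally, so treat it as an illustration of the general mechanism only.

```php
<?php
// "Chávez" as UTF-8 bytes: the á is the two-byte sequence C3 A1.
$utf8 = "Ch\xC3\xA1vez";

// Decoding those bytes as ISO-8859-1 treats each byte as its own character,
// so C3 becomes "Ã" and A1 becomes "¡".
$misread = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo $misread, "\n"; // ChÃ¡vez

// Reversing the conversion recovers the original UTF-8 bytes.
$repaired = iconv('UTF-8', 'ISO-8859-1', $misread);
echo $repaired, "\n"; // Chávez
```

The round trip only works while every misread character still exists in ISO-8859-1, which is why fixing the encoding before parsing is more reliable than repairing the text afterwards.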

SimplePie 1.3 character encoding problem
