Question

I want to parse the Google News RSS feed with PHP. I managed to get this code running:

<?php
$news = simplexml_load_file('http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=n&output=rss');

foreach($news->channel->item as $item) {
    echo "<strong>" . $item->title . "</strong><br />";
    echo strip_tags($item->description) ."<br /><br />";
}
?>

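However, there are a few things I can't figure out. For example:

How do I get the hyperlink for each news title?
Each Google News item has many related news links in its footer (and my code above includes them). How can I remove them from the description?
How can I get the image for each news item? (Google shows a thumbnail for each story.)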

Thanks.

Answer 1

Here we go, this covers just what you need for your particular case:

<?php
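$news = simplexml_load_file('http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=n&output=rss');

// simplexml_load_file() returns false if the feed cannot be fetched or parsed.
if ($news === false) {
    exit('Could not load the feed.');
}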

$feeds = array();

$i = 0;

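foreach ($news->channel->item as $item) 
{
    // Cast once: SimpleXML returns an object, but the string functions below expect a string.
    $description = (string) $item->description;

    // The thumbnail URL is the first src="..." attribute in the description HTML, if any.
    $image = preg_match('@src="([^"]+)"@', $description, $match) ? $match[1] : null;
    $parts = explode('<font size="-1">', $description);

    $feeds[$i]['title'] = (string) $item->title;
    $feeds[$i]['link'] = (string) $item->link;
    $feeds[$i]['image'] = $image;
    // Guard against descriptions with fewer <font size="-1"> sections than expected.
    $feeds[$i]['site_title'] = isset($parts[1]) ? strip_tags($parts[1]) : '';
    $feeds[$i]['story'] = isset($parts[2]) ? strip_tags($parts[2]) : '';

    $i++;
}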

echo '<pre>';
print_r($feeds);
echo '</pre>';
?>

The output should look like this:

[2] => Array
        (
            [title] => Los Alamos Nuclear Lab Under Siege From Wildfire - ABC News
            [link] => http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGxBe4YsZArH0kSwEjq_zDm_h-N4A&url=http://abcnews.go.com/Technology/wireStory?id%3D13951623
            [image] => http://nt2.ggpht.com/news/tbn/OhH43xORRwiW1M/6.jpg
            [site_title] => ABC News
            [story] => A wildfire burning near the desert birthplace of the atomic bomb advanced on the Los Alamos laboratory and thousands of outdoor drums of plutonium-contaminated waste Tuesday as authorities stepped up ...
        )
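
The live feed URL may not always be reachable, so it can help to try the same extraction against a fixed string first. The following is only a minimal sketch under that assumption: parse_news_items() is a hypothetical helper name, the sample feed is made up, and only the title, link and image fields are extracted.

```php
<?php
// Hypothetical helper (not part of the original answer): turns RSS XML into
// an array of items with the title, link and first image URL of each one.
function parse_news_items(string $xml): array
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $news = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($news === false || !isset($news->channel->item)) {
        return array(); // malformed XML, or a feed without items
    }

    $feeds = array();
    foreach ($news->channel->item as $item) {
        $description = (string) $item->description;
        // First src="..." attribute in the description HTML, or null if there is none.
        $image = preg_match('@src="([^"]+)"@', $description, $match) ? $match[1] : null;

        $feeds[] = array(
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
            'image' => $image,
        );
    }

    return $feeds;
}

// Made-up sample feed, for illustration only.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <item>
      <title>Example headline - Example Source</title>
      <link>http://example.com/story</link>
      <description>&lt;img src="http://example.com/thumb.jpg" /&gt;&lt;font size="-1"&gt;Example Source&lt;/font&gt;</description>
    </item>
  </channel>
</rss>
XML;

print_r(parse_news_items($sample));
```

Returning an empty array on a parse failure keeps the calling code simple; whether that is the right behaviour (as opposed to throwing an exception) depends on how the result is used.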

Answer 2

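I'd suggest looking at SimplePie. I've used it on several different projects and it works well (and it abstracts away all the headaches you're currently dealing with).

Now, if you're writing this code yourself simply because you want to learn how to do it, you should probably ignore this answer. :)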

Answer 3

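To get the URL of a news item, use $item->link.
If the related news links share a common delimiter, you can use a regex to remove everything after it.
Google puts the thumbnail image's HTML in the feed's description field. You can use a regex to grab everything between the opening and closing brackets of the image tag to get the HTML.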
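
The last two points could be sketched roughly as follows. This is an illustration, not a tested recipe for the real feed: the function names and the sample description are made up, and RELATED_DELIMITER is a placeholder, since the answer does not say what actually separates the story from the related links.

```php
<?php
// Placeholder delimiter (an assumption): inspect the real description HTML
// to find whatever marker precedes the related links.
const RELATED_DELIMITER = '<div class="related">';

// Returns the first complete <img ...> tag in $html, or null if there is none.
// Note: this simple pattern assumes no attribute value contains a ">" character.
function extract_img_tag(string $html): ?string
{
    return preg_match('/<img\b[^>]*>/i', $html, $m) ? $m[0] : null;
}

// Removes the delimiter and everything after it; returns the input unchanged
// if the delimiter does not occur.
function strip_related_links(string $html, string $delimiter = RELATED_DELIMITER): string
{
    $pattern = '/' . preg_quote($delimiter, '/') . '.*$/s';
    $result = preg_replace($pattern, '', $html);

    return $result === null ? $html : $result;
}

// Made-up description, for illustration only.
$description = '<img src="http://example.com/thumb.jpg" alt="" />'
    . '<p>Main story text.</p>'
    . '<div class="related"><a href="http://example.com/other">Other story</a></div>';

echo extract_img_tag($description), "\n";
echo strip_tags(strip_related_links($description)), "\n";
```

If the markup turns out to be irregular, a real HTML parser is usually more robust than regular expressions for this kind of extraction.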

Parsing Google News RSS with PHP
