I'm scraping a page with Simple HTML DOM to get the latest news, and then generating an RSS feed with this PHP class.
This is what I have so far:
<?php
// This is a minimum example of using the class
include("FeedWriter.php");
include('simple_html_dom.php');
$html = file_get_html('http://www.website.com');
foreach($html->find('td[width="380"] p table') as $article) {
$item['title'] = $article->find('span.title', 0)->innertext;
$item['description'] = $article->find('.ingress', 0)->innertext;
$item['link'] = $article->find('.lesMer', 0)->href;
$item['pubDate'] = $article->find('span.presseDato', 0)->plaintext;
$articles[] = $item;
}
//Creating an instance of FeedWriter class.
$TestFeed = new FeedWriter(RSS2);
//Use wrapper functions for common channel elements
$TestFeed->setTitle('Testing & Checking the RSS writer class');
$TestFeed->setLink('http://www.ajaxray.com/projects/rss');
$TestFeed->setDescription('This is test of creating a RSS 2.0 feed Universal Feed Writer');
//Image title and link must match with the 'title' and 'link' channel elements for valid RSS 2.0
$TestFeed->setImage('Testing the RSS writer class','http://www.ajaxray.com/projects/rss','http://www.rightbrainsolution.com/images/logo.gif');
foreach($articles as $row) {
//Create an empty FeedItem
$newItem = $TestFeed->createNewItem();
//Add elements to the feed item
$newItem->setTitle($row['title']);
$newItem->setLink($row['link']);
$newItem->setDate($row['pubDate']);
$newItem->setDescription($row['description']);
//Now add the feed item
$TestFeed->addItem($newItem);
}
//OK. Everything is done. Now generate the feed.
$TestFeed->genarateFeed();
?>
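How can I make this code simpler? Right now there are two foreach statements; how can I combine them?
Because the news stories are in Norwegian, I need to apply html_entity_decode() to the title. I tried it here, but I can't get it to work: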
foreach($html->find('td[width="380"] p table') as $article) {
$item['title'] = html_entity_decode($article->find('span.title', 0)->innertext, ENT_NOQUOTES, 'UTF-8');
$item['description'] = "<img src='" . $article->find('img[width="100"]', 0)->src . "'><p>" . $article->find('.ingress', 0)->innertext . "</p>";
$item['link'] = $article->find('.lesMer', 0)->href;
$item['pubDate'] = unix2rssdate(strtotime($article->find('span.presseDato', 0)->plaintext));
$articles[] = $item;
}
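Thanks :)
Answer 0 (score: 4)
You appear to loop through $html to build an array of articles, and then loop through that array to add the articles to the feed. You can skip the second loop entirely by adding each item to the feed as soon as it is found. To do that, you need to move the FeedWriter constructor a little earlier in the execution flow.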
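I would also add a couple of methods to help with readability, which may help maintainability in the long run. Encapsulating the feed creation, item modification and so on should make things easier if you ever need to plug in a different provider class for the feed, change the parsing rules, etc. Further improvements to the code are possible (for example, html_entity_decode is applied on a separate line from the $item['title'] assignment), but you get the general idea.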
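A minimal sketch of that kind of encapsulation, assuming the scraped values have already been pulled out as plain strings. The helper names (cleanTitle, toRssDate, normaliseArticle) are hypothetical and are not part of Simple HTML DOM or FeedWriter; the date helper also assumes the input is in a format strtotime() can parse, which may not hold for Norwegian date strings.

```php
<?php
// Hypothetical helpers for illustration only; none of these names come
// from Simple HTML DOM or FeedWriter.

/**
 * Decode HTML entities (e.g. &aring;, &oslash;) in a scraped title.
 */
function cleanTitle(string $rawTitle): string
{
    return trim(html_entity_decode($rawTitle, ENT_QUOTES, 'UTF-8'));
}

/**
 * Convert a date string that strtotime() understands into the RSS
 * (RFC 822 style) format, or return null when it cannot be parsed.
 */
function toRssDate(string $rawDate): ?string
{
    $timestamp = strtotime($rawDate);
    return $timestamp === false ? null : gmdate(DATE_RSS, $timestamp);
}

/**
 * Normalise one scraped article (an array of plain strings) into
 * values that are ready to hand to the feed item setters.
 */
function normaliseArticle(array $raw): array
{
    return [
        'title'       => cleanTitle($raw['title']),
        'description' => $raw['description'],
        'link'        => $raw['link'],
        'pubDate'     => toRssDate($raw['pubDate']),
    ];
}
```

Inside the foreach over $html->find(...), you would build the $raw array from the same find() calls used in the question, pass it through normaliseArticle(), and then call setTitle(), setLink() and so on with the result. Whether setDate() expects a timestamp or an already formatted string depends on the version of the FeedWriter class, so check that before settling on toRssDate().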
What problem are you having with html_entity_decode? Do you have a sample of the input and output?
Answer 1 (score: 3)
For a simple combination of the two loops, you can build the feed as you parse through the HTML:
<?php
include("FeedWriter.php");
include('simple_html_dom.php');
$html = file_get_html('http://www.website.com');
//Creating an instance of FeedWriter class.
$TestFeed = new FeedWriter(RSS2);
$TestFeed->setTitle('Testing & Checking the RSS writer class');
$TestFeed->setLink('http://www.ajaxray.com/projects/rss');
$TestFeed->setDescription(
'This is test of creating a RSS 2.0 feed Universal Feed Writer');
$TestFeed->setImage('Testing the RSS writer class',
'http://www.ajaxray.com/projects/rss',
'http://www.rightbrainsolution.com/images/logo.gif');
//parse through the HTML and build up the RSS feed as we go along
foreach($html->find('td[width="380"] p table') as $article) {
//Create an empty FeedItem
$newItem = $TestFeed->createNewItem();
//Look up and add elements to the feed item
$newItem->setTitle($article->find('span.title', 0)->innertext);
$newItem->setDescription($article->find('.ingress', 0)->innertext);
$newItem->setLink($article->find('.lesMer', 0)->href);
$newItem->setDate($article->find('span.presseDato', 0)->plaintext);
//Now add the feed item
$TestFeed->addItem($newItem);
}
$TestFeed->genarateFeed();
?>
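What is the problem you're seeing with html_entity_decode? It might help if you gave us a link to a page it doesn't work on.
Answer 2 (score: 0)
How can I make this code simpler?
I know this isn't quite what you asked, but do you know about Yahoo! Pipes (http://pipes.yahoo.com/pipes/)?
Answer 3 (score: 0)
Maybe you could use something like Feedity (http://feedity.com), which already solves the problem of generating an RSS feed from any web page.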
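One way to produce such a sample is to run a single title through html_entity_decode in isolation and compare the flags. The input string below is made up for illustration; a real title would come from the scraped page. If the decoded output still looks wrong in the feed, a mismatch between the charset passed here and the charset the feed is served with is one possible cause, but that is an assumption to verify rather than a diagnosis.

```php
<?php
// Made-up sample input; real titles come from the scraped page.
$raw = 'Bl&aring;b&aelig;r &amp; &quot;&oslash;l&quot;';

// ENT_NOQUOTES decodes named entities but leaves &quot; and &#039; untouched.
$noQuotes = html_entity_decode($raw, ENT_NOQUOTES, 'UTF-8');

// ENT_QUOTES also decodes both kinds of quote entity.
$withQuotes = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');

// A hex dump shows the actual bytes, which helps tell UTF-8 output
// apart from a single-byte encoding such as ISO-8859-1.
echo $noQuotes, "\n", $withQuotes, "\n", bin2hex($withQuotes), "\n";
```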