Extracting an Atom feed from a web page/blog

Time: 2014-01-28 09:40:25

Tags: java atom-feed

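I am trying to parse an Atom feed from a web page, but the third line of the snippet below shows an error. When I try to resolve it, the only option offered is "Configure Build Path". How can I fix this? I have tried, but the error is still there. Please help me solve this problem.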

URL feedUrl = new URL("http://localhost:8080/namespace/feed/");
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new XmlReader(feedUrl));
System.out.println("Feed Title: " + feed.getTitle());

This is the code I tried:

try {
    URL url = new URL("https://www.google.com/search?hl=en&q=robbery&tbm=blg&output=atom");
    SyndFeedInput input = new SyndFeedInput();
    SyndFeed feed = input.build(new XmlReader(url));
    System.out.println("Feed Title: " + feed.getTitle());
    for (SyndEntry entry : (List<SyndEntry>) feed.getEntries()) {
        System.out.println("Title: " + entry.getTitle());
        System.out.println("Unique Identifier: " + entry.getUri());
        System.out.println("Updated Date: " + entry.getUpdatedDate());
        for (SyndLinkImpl link : (List<SyndLinkImpl>) entry.getLinks()) {
            System.out.println("Link: " + link.getHref());
        }
        for (SyndContentImpl content : (List<SyndContentImpl>) entry.getContents()) {
            System.out.println("Content: " + content.getValue());
        }
        for (SyndCategoryImpl category : (List<SyndCategoryImpl>) entry.getCategories()) {
            System.out.println("Category: " + category.getName());
        }
    }
} catch (Exception ex) {
    // Print the failure instead of silently swallowing it
    System.err.println(ex);
}

1 Answer:

Answer 0: (score: 3)

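I assume you are using Rome. Make sure that all of Rome's dependencies are on your classpath and that the required libraries have been added to the build path.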

Your code works for me. Perhaps your libraries are corrupted; you can download them again from:

http://mvnrepository.com/artifact/rome/rome/1.0

http://mvnrepository.com/artifact/jdom/jdom/1.0

(I recommend using Maven to manage your project.)

Then add the libraries to the build path again, clean the project, and run it again.
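If you want to confirm that the jars really are visible at runtime, a quick check like the one below may help. It is only a sketch: the class name `RomeClasspathCheck` and the helper `isOnClasspath` are made up for illustration, and `org.jdom.Document` is assumed to be a representative class from the JDOM 1.0 jar. `com.sun.syndication.io.SyndFeedInput` is the Rome class your own code already imports.

```java
public class RomeClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // by this class's class loader, without initializing it.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, RomeClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing jar, or a jar that is present but cannot be linked
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "com.sun.syndication.io.SyndFeedInput", // from the Rome jar
            "org.jdom.Document"                     // assumed: from the JDOM jar
        };
        for (String name : required) {
            String status = isOnClasspath(name) ? "found" : "MISSING";
            System.out.println(name + " : " + status);
        }
    }
}
```

If either line reports MISSING, the corresponding jar is not on the runtime classpath, which would match the "Configure Build Path" suggestion you are seeing.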

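You are getting an HTTP 403 error because Google blocks unrecognized HTTP clients. Your client needs to identify itself as a recognized one such as Chrome, MSIE, or Gecko, so set a User-Agent header on your HTTP request and it will work.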

Try this code:

import java.net.URL;
import java.net.URLConnection;
import java.util.List;

import com.sun.syndication.feed.synd.SyndCategoryImpl;
import com.sun.syndication.feed.synd.SyndContentImpl;
import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.feed.synd.SyndLinkImpl;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

public class Rome {

    public static void main(String[] args) {
        try {
            URLConnection urlConnection = new URL("https://www.google.com/search?hl=en&q=robbery&tbm=blg&output=atom").openConnection();
            urlConnection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");

            SyndFeedInput input = new SyndFeedInput();
            input.setPreserveWireFeed(true);
            SyndFeed feed = input.build(new XmlReader(urlConnection));
            System.out.println("Feed Title: " + feed.getTitle());
            for (SyndEntry entry : (List<SyndEntry>) feed.getEntries()) {
                System.out.println("Title: " + entry.getTitle());
                System.out.println("Unique Identifier: " + entry.getUri());
                System.out.println("Updated Date: " + entry.getUpdatedDate());
                for (SyndLinkImpl link : (List<SyndLinkImpl>) entry.getLinks()) {
                    System.out.println("Link: " + link.getHref());
                }
                for (SyndContentImpl content : (List<SyndContentImpl>) entry.getContents()) {
                    System.out.println("Content: " + content.getValue());
                }

                for (SyndCategoryImpl category : (List<SyndCategoryImpl>) entry.getCategories()) {
                    System.out.println("Category: " + category.getName());
                }
            }// for
        }// try
        catch (Exception ex) {
            System.err.println(ex);
        }

    }
}