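I want to read the content of a given URL, but as XML rather than HTML.
A sample RSS response is shown below. All I want to do is retrieve something like this sample instead of HTML data: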
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
<channel>
<title>Yahoo! Weather - Sunnyvale, CA</title>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://weather.yahoo.com/forecast/USCA1116_f.html</link>
<description>Yahoo! Weather for Sunnyvale, CA</description>
<language>en-us</language>
<lastBuildDate>Fri, 18 Dec 2009 9:38 am PST</lastBuildDate>
<ttl>60</ttl>
<yweather:location city="Sunnyvale" region="CA" country="United States"/>
<yweather:units temperature="F" distance="mi" pressure="in" speed="mph"/>
<yweather:wind chill="50" direction="0" speed="0" />
<yweather:atmosphere humidity="94" visibility="3" pressure="30.27" rising="1" />
<yweather:astronomy sunrise="7:17 am" sunset="4:52 pm"/>
<image>
<title>Yahoo! Weather</title>
<width>142</width>
<height>18</height>
<link>http://weather.yahoo.com</link>
<url>http://l.yimg.com/a/i/us/nws/th/main_142b.gif</url>
</image>
<item>
<title>Conditions for Sunnyvale, CA at 9:38 am PST</title>
<geo:lat>37.37</geo:lat>
<geo:long>-122.04</geo:long>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://weather.yahoo.com/forecast/USCA1116_f.html</link>
<pubDate>Fri, 18 Dec 2009 9:38 am PST</pubDate>
<yweather:condition text="Mostly Cloudy" code="28" temp="50" date="Fri, 18 Dec 2009 9:38 am PST" />
<description><![CDATA[
<img src="http://l.yimg.com/a/i/us/we/52/28.gif"/><br />
<b>Current Conditions:</b><br />
Mostly Cloudy, 50 F<BR />
<BR /><b>Forecast:</b><BR />
Fri - Partly Cloudy. High: 62 Low: 49<br />
Sat - Partly Cloudy. High: 65 Low: 49<br />
<br />
<a href="http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://weather.yahoo.com/forecast/USCA1116_f.html">Full Forecast at Yahoo! Weather</a><BR/><BR/>
(provided by <a href="http://www.weather.com" >The Weather Channel</a>)<br/>
]]></description>
<yweather:forecast day="Fri" date="18 Dec 2009" low="49" high="62" text="Partly Cloudy" code="30" />
<yweather:forecast day="Sat" date="19 Dec 2009" low="49" high="65" text="Partly Cloudy" code="30" />
<guid isPermaLink="false">USCA1116_2009_12_18_9_38_PST</guid>
</item>
</channel>
</rss>
I used this code, but got an error:
package search;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Process {
    public static void main(String[] args) throws IOException {
        URL xmlUrl = new URL("http://www.yahoo.com");
        InputStream in = xmlUrl.openStream();
        Document doc = parse(in);
    }

    public static Document parse(InputStream is) {
        Document ret = null;
        DocumentBuilderFactory domFactory;
        DocumentBuilder builder;
        try {
            domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setValidating(false);
            domFactory.setNamespaceAware(false);
            builder = domFactory.newDocumentBuilder();
            ret = builder.parse(is);
        }
        catch (Exception ex) {
            System.err.println("unable to load XML: " + ex);
        }
        return ret;
    }
}
Error:
[Fatal Error] :7:17: The entity "lrm" was referenced, but not declared.
unable to load XML: org.xml.sax.SAXParseException; lineNumber: 7; columnNumber: 17; The entity "lrm" was referenced, but not declared.
Answer 0 (score: 1)
Make an HTTP request to your URL and parse the returned String. Am I missing the point? I don't understand your question...
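
To make that concrete: the error in the question most likely comes from the URL rather than from the parsing code. `http://www.yahoo.com` returns an HTML page, and HTML may use named entities such as `&lrm;` that XML does not predefine, so the XML parser rejects the page. Pointing the same code at the actual RSS feed URL should avoid this. Below is a minimal sketch under some assumptions: the class name `FeedSketch` and its helper methods are made up for illustration, the feed URL is taken from the command line because the question does not state it, and an embedded, trimmed copy of the sample response is used when no URL is given.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the original question.
public class FeedSketch {
    // Namespace URI copied from the sample response in the question.
    public static final String YWEATHER_NS = "http://xml.weather.yahoo.com/ns/rss/1.0";

    // Trimmed copy of the sample feed so the sketch runs without network access.
    public static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<rss version=\"2.0\" xmlns:yweather=\"" + YWEATHER_NS + "\">"
        + "<channel><title>Yahoo! Weather - Sunnyvale, CA</title><item>"
        + "<yweather:condition text=\"Mostly Cloudy\" code=\"28\" temp=\"50\"/>"
        + "</item></channel></rss>";

    // Parses an XML stream into a DOM tree; namespace awareness is switched on
    // so that the yweather:* elements can be looked up by namespace URI.
    public static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(in);
    }

    // Returns the "text" attribute of the first yweather:condition element,
    // or null if the document has no such element.
    public static String conditionText(Document doc) {
        NodeList nodes = doc.getElementsByTagNameNS(YWEATHER_NS, "condition");
        if (nodes.getLength() == 0) {
            return null;
        }
        return ((Element) nodes.item(0)).getAttribute("text");
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0], if present, is the URL of an RSS/XML feed,
        // not of an HTML page.
        try (InputStream in = args.length > 0
                ? new URL(args[0]).openStream()
                : new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8))) {
            Document doc = parse(in);
            System.out.println(conditionText(doc));
        }
    }
}
```

Run without arguments, this prints `Mostly Cloudy` from the embedded sample. Passing a feed URL as the first argument fetches and parses that feed instead. If the server returns HTML, the same kind of `SAXParseException` as in the question is to be expected.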