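I am currently writing a basic weather application for university, which involves retrieving weather information from the BBC Weather RSS feed.
I have set it up to write the RSS feed to a file (output.xml), which a parser class then uses to build the tree.
However, when I run it I get the following error: The markup in the document following the root element must be well-formed.
On inspecting the downloaded XML file, I noticed that the first two nodes are missing.
Here is the downloaded XML: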
<channel>
<atom:link href="http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss" rel="self" type="application/rss+xml" />
<title>BBC Weather - Observations for Bangor, United Kingdom</title>
<link>http://www.bbc.co.uk/weather/2656397</link>
<description>Latest observations for Bangor from BBC Weather, including weather, temperature and wind information</description>
<language>en</language>
<copyright>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</copyright>
<pubDate>Thu, 12 Mar 2015 05:35:08 +0000</pubDate>
<item>
<title>Thursday - 05:00 GMT: Thick Cloud, 10°C (50°F)</title>
<link>http://www.bbc.co.uk/weather/2656397</link>
<description>Temperature: 10°C (50°F), Wind Direction: South Easterly, Wind Speed: 8mph, Humidity: 90%, Pressure: 1021mb, Falling, Visibility: Very Good</description>
<pubDate>Thu, 12 Mar 2015 05:35:08 +0000</pubDate>
<guid isPermaLink="false">http://www.bbc.co.uk/weather/2656397-2015-03-12T05:35:08.000Z</guid>
<georss:point>53.22647 -4.13459</georss:point>
</item>
</channel>
</rss>
The XML should have the following two nodes before the <channel> node:
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss" version="2.0">
Here is the code I use to retrieve the XML file:
public static void main(String[] args) throws SAXException, IOException, XPathExpressionException {
URL url = new URL("http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss");
URLConnection con = url.openConnection();
StringBuilder builder;
try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()))) {
builder = new StringBuilder();
String line;
if (!in.readLine().isEmpty()) {
line = in.readLine();
}
while ((line = in.readLine()) != null) {
builder.append(line).append("\n");
}
String input = builder.toString();
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(new File("output.xml"))));
out.write(input);
out.flush();
}
try {
WeatherParser parser = new WeatherParser();
System.out.println(parser.parse("output.xml"));
} catch (ParserConfigurationException ex) {
}
}
Here is the code that parses the XML (WeatherParser.java):
public class WeatherParser {
public WeatherParser() throws ParserConfigurationException {
xpfactory = XPathFactory.newInstance();
path = xpfactory.newXPath();
dbfactory = DocumentBuilderFactory.newInstance();
builder = dbfactory.newDocumentBuilder();
}
public String parse(String fileName) throws SAXException, IOException, XPathExpressionException {
File f = new File(fileName);
org.w3c.dom.Document doc = builder.parse(f);
StringBuilder info = new StringBuilder();
info.append(path.evaluate("/channel/item/title", doc));
return info.toString();
}
private DocumentBuilderFactory dbfactory;
private DocumentBuilder builder;
private XPathFactory xpfactory;
private XPath path;
}
Hopefully this is enough information.
Answer 0 (score: 1)
The first two lines are missing because you read them but never "save" them.
Remove the following block and it will work:
if (!in.readLine().isEmpty()) {
line = in.readLine();
}
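In the if condition, you read the first line (<?xml....) and do not keep it. The statement line = in.readLine(); then reads the second line, but as soon as you enter the while loop, the value held in the line variable is overwritten.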
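To illustrate the fix, here is a minimal sketch of the copy loop with the if block removed. The class name FeedLines and the helper readAll are hypothetical names chosen for this example, and main reads from an in-memory string rather than the BBC feed so that it runs offline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class FeedLines {
    // Hypothetical helper: copies every line from the reader, skipping nothing.
    static String readAll(Reader source) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            // There is no readLine() call before the loop,
            // so the XML declaration and the <rss> line are kept.
            while ((line = in.readLine()) != null) {
                builder.append(line).append("\n");
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        // Shortened stand-in for the feed, not the real BBC response.
        String feed = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<rss version=\"2.0\">\n"
                + "<channel></channel>\n"
                + "</rss>\n";
        System.out.print(readAll(new StringReader(feed)));
    }
}
```

For the real feed, the reader would be the InputStreamReader around con.getInputStream() from the question.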
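Answer 1 (score: 0)
First of all, you must not manipulate the data stream the server sends you. Drop the StringBuilder.
If you want to save the XML to disk, write it out verbatim: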
URL url = new URL("http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss");
URLConnection con = url.openConnection();
InputStream in = con.getInputStream();
FileOutputStream out = new FileOutputStream("output.xml");
byte[] b = new byte[1024];
int count;
while ((count = in.read(b)) >= 0) {
out.write(b, 0, count);
}
out.flush(); out.close(); in.close();
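The same verbatim copy can also be sketched with try-with-resources and Files.copy from the standard library, which closes the stream even if an exception is thrown part-way through. The class name FeedSaver and the method save are hypothetical, and the demo in main copies from an in-memory byte array instead of the network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class FeedSaver {
    // Hypothetical helper: writes the stream to the target byte for byte
    // and returns the number of bytes written.
    static long save(InputStream source, Path target) throws IOException {
        try (InputStream in = source) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Shortened stand-in for the feed, not the real BBC response.
        byte[] feed = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<rss version=\"2.0\"/>\n"
                .getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("output", ".xml");
        long written = save(new ByteArrayInputStream(feed), target);
        System.out.println(written + " bytes written to " + target);
        Files.delete(target);
    }
}
```

For the real feed, the source would be con.getInputStream() and the target Paths.get("output.xml").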
In fact, you don't need to write it to disk at all. You can build the XML document directly from the input stream.
public static Document readXml(InputStream is) throws SAXException, ParserConfigurationException, IOException {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setIgnoringComments(false);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
return db.parse(is);
}
which lets you do:
public static void main (String[] args) throws java.lang.Exception
{
URL observationsUrl = new URL("http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss");
Document observations = readXml(observationsUrl.openConnection().getInputStream());
XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
String title = xpath.evaluate("/rss/channel/title", observations);
System.out.println(title);
XPathExpression rssitemsExpr = xpath.compile("/rss/channel/item");
NodeList items = (NodeList)rssitemsExpr.evaluate(observations, XPathConstants.NODESET);
for (int i = 0; i < items.getLength(); i++) {
System.out.println(xpath.evaluate("./title", items.item(i)));
}
}
This gives me the output:
BBC Weather - Observations for Bangor, United Kingdom
Thursday - 06:00 GMT: Thick Cloud, 11°C (52°F)
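Note that the expressions above start at /rss, whereas the parser in the question evaluates /channel/item/title. Since rss is the root element, an absolute path that starts at /channel matches nothing, and evaluating it as a string yields an empty string. A small sketch on an in-memory document shows the difference; the class name XPathRootDemo and the shortened sample XML are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathRootDemo {
    // Shortened sample in the shape of the feed, not the real BBC response.
    static final String SAMPLE = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<rss version=\"2.0\"><channel><item>"
            + "<title>Thursday - 05:00 GMT: Thick Cloud</title>"
            + "</item></channel></rss>";

    // Parses SAMPLE and returns the string value of the given XPath expression.
    static String evaluate(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // The root element is rss, so this absolute path selects no node.
        System.out.println("[" + evaluate("/channel/item/title") + "]");
        // Starting at the actual root selects the item title.
        System.out.println("[" + evaluate("/rss/channel/item/title") + "]");
    }
}
```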