How to get information from an XML file in Java

Time: 2015-01-27 17:20:59

Tags: java xml xpath

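I am trying to use XPath to get information out of an XML file, but without success. I want the summary and content information inside the wiki element (near the bottom of the XML below), and I tried this:

String xpath="/*[local-name(.)='wiki']/*[local-name(.)='summary']";

But I get nothing back. Maybe my XPath is wrong? Or is it because of the CDATA? I'm a beginner, so any other tips are welcome.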

<lfm status="ok">
<album>
<name>Believe</name>
<artist>Cher</artist>
<id>2026126</id>
<mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid>
<url>http://www.last.fm/music/Cher/Believe</url>
<releasedate>5 Jul 2005, 00:00</releasedate>
<image size="small">http://userserve-ak.last.fm/serve/34s/88057565.png</image>
<image size="medium">http://userserve-ak.last.fm/serve/64s/88057565.png</image>
<image size="large">
http://userserve-ak.last.fm/serve/174s/88057565.png
</image>
<image size="extralarge">
http://userserve-ak.last.fm/serve/300x300/88057565.png
</image>
<image size="mega">
http://userserve-ak.last.fm/serve/_/88057565/Believe.png
</image>
<listeners>259410</listeners>
<playcount>1501557</playcount>
<tracks>
<track rank="1">
<name>Believe</name>
<duration>239</duration>
<mbid>403ceb02-581b-4c36-8814-6f2a29a3d213</mbid>
<url>http://www.last.fm/music/Cher/_/Believe</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="2">
<name>The Power</name>
<duration>233</duration>
<mbid>6b3de6b5-db70-49c9-b58d-e132a3eb1a36</mbid>
<url>http://www.last.fm/music/Cher/_/The+Power</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="3">
<name>Runaway</name>
<duration>286</duration>
<mbid>379f760d-1f29-4317-ab04-06a8218a874d</mbid>
<url>http://www.last.fm/music/Cher/_/Runaway</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="4">
<name>All or Nothing</name>
<duration>238</duration>
<mbid>a88735e6-b35c-4379-8ef7-bbd2b793ccf4</mbid>
<url>http://www.last.fm/music/Cher/_/All+or+Nothing</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="5">
<name>Strong Enough</name>
<duration>220</duration>
<mbid>26107af6-7dda-4844-85a5-8d61f24f4fc2</mbid>
<url>http://www.last.fm/music/Cher/_/Strong+Enough</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="6">
<name>Dov'è L'amore</name>
<duration>258</duration>
<mbid>58153307-25dd-4ff6-87f0-e08777e34539</mbid>
<url>
http://www.last.fm/music/Cher/_/Dov%27%C3%A8+L%27amore
</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="7">
<name>Takin' Back My Heart</name>
<duration>272</duration>
<mbid>07a38e80-ba81-494a-a61a-e8d81a40413e</mbid>
<url>
http://www.last.fm/music/Cher/_/Takin%27+Back+My+Heart
</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="8">
<name>Taxi Taxi</name>
<duration>304</duration>
<mbid>66f526c9-b135-4458-86cf-77065ce8f0aa</mbid>
<url>http://www.last.fm/music/Cher/_/Taxi+Taxi</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="9">
<name>Love Is the Groove</name>
<duration>271</duration>
<mbid>832f8f9a-95e4-476b-b108-14dec1dc84ba</mbid>
<url>http://www.last.fm/music/Cher/_/Love+Is+the+Groove</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
<track rank="10">
<name>We All Sleep Alone</name>
<duration>236</duration>
<mbid>2286a77a-644a-4c86-9d43-31c029c3625b</mbid>
<url>http://www.last.fm/music/Cher/_/We+All+Sleep+Alone</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
</track>
</tracks>
<toptags>
<tag>
<name>sourabh</name>
<url>http://www.last.fm/tag/sourabh</url>
</tag>
<tag>
<name>albums</name>
<url>http://www.last.fm/tag/albums</url>
</tag>
<tag>
<name>pop</name>
<url>http://www.last.fm/tag/pop</url>
</tag>
<tag>
<name>90s</name>
<url>http://www.last.fm/tag/90s</url>
</tag>
<tag>
<name>dance</name>
<url>http://www.last.fm/tag/dance</url>
</tag>
</toptags>
<wiki>
<published>Sat, 6 Mar 2010 16:48:03 +0000</published>
<summary>
<![CDATA[
Believe is the twenty-third studio album by American singer-actress Cher, released on November 10, 1998 by Warner Bros. Records. The RIAA certified it Quadruple Platinum on December 23, 1999, recognizing four million shipments in the United States; Worldwide, the album has sold more than 20 million copies, making it the biggest-selling album of her career. In 1999 the album received three Grammy Awards nominations including &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot; for the single &quot;Believe&quot;.
]]>
</summary>
<content>
<![CDATA[
Believe is the twenty-third studio album by American singer-actress Cher, released on November 10, 1998 by Warner Bros. Records. The RIAA certified it Quadruple Platinum on December 23, 1999, recognizing four million shipments in the United States; Worldwide, the album has sold more than 20 million copies, making it the biggest-selling album of her career. In 1999 the album received three Grammy Awards nominations including &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot; for the single &quot;Believe&quot;.

 It was released by Warner Bros. Records at the end of 1998. The album was executive produced by Rob Dickens. Upon its debut, critical reception was generally positive. Believe became Cher's most commercially-successful release, reached number one and Top 10 all over the world. In the United States, the album was released on November 10, 1998, and reached number four on the Billboard 200 chart, where it was certified four times platinum.

 The album featured a change in Cher's music; in addition, Believe presented a vocally stronger Cher and a massive use of vocoder and Auto-Tune. In 1999, the album received 3 Grammy Awards nominations for &quot;Record of the Year&quot;, &quot;Best Pop Album&quot; and winning &quot;Best Dance Recording&quot;. Throughout 1999 and into 2000 Cher was nominated and winning many awards for the album including a Billboard Music Award for &quot;Female Vocalist of the Year&quot;, Lifelong Contribution Awards and a Star on the Walk of Fame shared with former Sonny Bono. The boost in Cher's popularity led to a very successful Do You Believe? Tour.

 The album was dedicated to Sonny Bono, Cher's former husband who died earlier that year from a skiing accident.

 Cher also recorded a cover version of &quot;Love Is in the Air&quot; during early sessions for this album. Although never officially released, the song has leaked on file sharing networks.

 Singles


 &quot;Believe&quot;
 &quot;Strong Enough&quot;
 &quot;All or Nothing&quot;
 &quot;Dov'è L'Amore&quot; User-contributed text is available under the Creative Commons By-SA License and may also be available under the GNU FDL.
]]>
</content>
</wiki>
</album>
</lfm>

And the Java part:

String urlToRead = "http://ws.audioscrobbler.com/2.0/?method=album.getinfo&api_key=1b76cd3eaf8349f06fb4e0a9e06e0760&artist=Cher&album=Believe";
URL url;
HttpURLConnection conn;
BufferedReader rd;
String line;
String result = "";
try {
    url = new URL(urlToRead);
    conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));

    while ((line = rd.readLine()) != null) {

        result += line;
    }
    rd.close();
} catch (Exception e) {
    e.printStackTrace();
}
String  out = result;
SAXReader reader = new SAXReader(false);
reader.setIncludeInternalDTDDeclarations(false);
reader.setIncludeExternalDTDDeclarations(false);
String xpath="/*[local-name(.)='wiki']/*[local-name(.)='summary']";
Document document = null;
try {
    document = reader.read(new StringReader(out));
} catch (DocumentException e) {
    e.printStackTrace();
}
List nodelist = document.selectNodes(xpath);

ArrayList outputList = new ArrayList();
ArrayList outputXmlList = new ArrayList();

String val = null;
String xmlVal = null;
for (Iterator iter = nodelist.iterator(); iter.hasNext();) {
    Node element = (Node) iter.next();
    xmlVal = element.asXML();
    val = element.getStringValue();
    if (val != null && !val.equals("")) {
        outputList.add(val);
        outputXmlList.add(xmlVal);

    }

}
System.out.println(outputList.get(0));

1 answer:

Answer 0 (score: 2)

The XPath you gave in the question:

/*[local-name(.)='wiki']/*[local-name(.)='summary']

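is an absolute path from the document root node, so for it to match, the wiki element would have to be the root element of the document, with summary as its direct child. That does not match the XML you provided, whose root lfm contains a child album, which in turn contains the wiki element.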
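To make the mismatch concrete, here is a minimal sketch that counts how many nodes a few paths select. It uses the JDK's built-in javax.xml.xpath API rather than dom4j as in the question, and the class name AbsolutePathDemo, the helper countMatches, and the shortened SAMPLE document are all assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AbsolutePathDemo {
    // Shortened stand-in for the response in the question (same nesting:
    // lfm > album > wiki > summary). Not the real API output.
    static final String SAMPLE =
        "<lfm status=\"ok\"><album><name>Believe</name>"
        + "<wiki><summary><![CDATA[ Believe is an album by Cher. ]]></summary></wiki>"
        + "</album></lfm>";

    // Hypothetical helper: returns how many nodes the XPath selects in the XML.
    static int countMatches(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // The path from the question: requires wiki to be the root element.
        String fromQuestion = "/*[local-name(.)='wiki']/*[local-name(.)='summary']";
        // Same local-name style, but walking down from the real root.
        String fullPath = "/*[local-name(.)='lfm']/*[local-name(.)='album']"
            + "/*[local-name(.)='wiki']/*[local-name(.)='summary']";
        // Descendant search: finds wiki at any depth.
        String anywhere = "//*[local-name(.)='wiki']/*[local-name(.)='summary']";

        System.out.println("question path: " + countMatches(SAMPLE, fromQuestion));
        System.out.println("full path:     " + countMatches(SAMPLE, fullPath));
        System.out.println("descendant:    " + countMatches(SAMPLE, anywhere));
    }
}
```

On this sample, the path from the question selects no nodes, while the other two each select the single summary element.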

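Given that your sample XML does not involve any namespaces, you can drop the local-name trick and simply use a path built from the plain element names, following the nesting described above:

/lfm/album/wiki/summary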
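As a rough sketch of the plain-name approach, the example below reads the text of the summary and content elements with the path /lfm/album/wiki/summary (and its content sibling), which is derived from the nesting of the sample XML. It again uses the JDK's javax.xml.xpath API instead of dom4j, and WikiSummaryDemo, textAt, and the shortened SAMPLE document are hypothetical names and data for illustration only. It also suggests that the CDATA sections are not the problem: their contents come back as ordinary text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WikiSummaryDemo {
    // Shortened stand-in for the response in the question, CDATA included.
    static final String SAMPLE =
        "<lfm status=\"ok\">\n<album>\n<name>Believe</name>\n<wiki>\n"
        + "<summary>\n<![CDATA[\nBelieve is an album by Cher.\n]]>\n</summary>\n"
        + "<content>\n<![CDATA[\nLonger text about the album.\n]]>\n</content>\n"
        + "</wiki>\n</album>\n</lfm>";

    // Hypothetical helper: returns the trimmed string value of the first
    // node selected by the XPath, or an empty string if nothing matches.
    static String textAt(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Merge CDATA sections into ordinary text nodes while parsing.
            factory.setCoalescing(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // The two-argument evaluate returns the result as a String.
            String value = XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
            return value.trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textAt(SAMPLE, "/lfm/album/wiki/summary"));
        System.out.println(textAt(SAMPLE, "/lfm/album/wiki/content"));
    }
}
```

The same path string should also be usable with the document.selectNodes(xpath) call from the question's dom4j code, although that variant is not tested here.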