我正在使用Jsoup来解析存储在filesystem中的xml文件,但是当我解析链接元素时会更改其范围......
XML文件: -
<movies>
<movie>
<id>0</id>
<name>Aag - 1948</name>
<link>http://www.songspk.pk/indian/aag_1948.html</link>
</movie>
<movie>
<id>1</id>
<name></name>
<link>#</link>
</movie>
<movie>
<id>2</id>
<name>Aa Ab Laut Chalain</name>
<link>http://www.songspk.pk/aa_ab_laut_chalein.html</link>
</movie>
<movie>
<id>3</id>
<name>Aag - RGV Ki Aag</name>
<link>http://www.songspk.pk/aag.html</link>
</movie>
</movies>
Java实现: -
public class DownloadSongsList {
private static Document document;
public static void main(String...string) throws IOException{
document = Jsoup.parse(new File("c:/movies.xml"), "UTF-8");
Elements movies = document.getElementsByTag("movies");
System.out.println(movies.html());
}
}
输出: -
<movie>
<id>
0
</id>
<name>
Aag - 1948
</name>
<link /> http://www.songspk.pk/indian/aag_1948.html
</movie>
<movie>
<id>
1
</id>
<name></name>
<link />#
</movie>
<movie>
<id>
2
</id>
<name>
Aa Ab Laut Chalain
</name>
<link />http://www.songspk.pk/aa_ab_laut_chalein.html
</movie>
<movie>
<id>
3
</id>
<name>
Aag - RGV Ki Aag
</name>
<link />http://www.songspk.pk/aag.html
</movie>
我想解析链接但由于此问题无法解决。 我想坚持使用Jsoup,因为我使用相同的库来创建以下xml文件......
答案 0 :(得分:1)
您是否尝试过使用Parser.xmlParser()
?
示例:
Document doc = Jsoup.parse(new File("c:/movies.xml"), "", Parser.xmlParser());
Elements movies = doc.getElementsByTag("movies");
System.out.println(movies.html());
应输出:
<movie>
<id>
0
</id>
<name>
Aag - 1948
</name>
<link>
http://www.songspk.pk/indian/aag_1948.html
</link>
</movie>
<movie>
<id>
1
</id>
<name></name>
<link>
#
</link>
</movie>
<movie>
<id>
2
</id>
<name>
Aa Ab Laut Chalain
</name>
<link>
http://www.songspk.pk/aa_ab_laut_chalein.html
</link>
</movie>
<movie>
<id>
3
</id>
<name>
Aag - RGV Ki Aag
</name>
<link>
http://www.songspk.pk/aag.html
</link>
</movie>
那么你可以正常提取<link>
标签:
Elements links = doc.getElementsByTag("link");