I use this function to fetch the items from this website and return them as a list.
Document doc = null;
try {
doc = Jsoup.connect("http://www.gamespy.com/index/release.html").get();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// Get all td's that are a child of a row - each game has 4 of these
Elements games = doc.select("tr > td.indexList1, tr > td.indexList2");
// Iterate over those elements, consuming four cells per game
ListIterator<Element> postIt = games.listIterator();
while (postIt.hasNext()) {
// Add the game text to the ArrayList
String name = postIt.next().text();
String platform = postIt.next().text();
String genre = postIt.next().text();
String releaseDate = postIt.next().text();
gameList.add(new GameRelease(name, platform, genre, releaseDate));
Log.v(TAG, name + platform + genre + releaseDate);
}
This is the HTML for each item:
<tr>
<td class="indexList1" align="left">
<a href="http://pc.gamespy.com/pc/hacker-evolution-duality-/" class="b1">
<em>Hacker Evolution Duality </em>
</a>
</td>
<td class="indexList1" align="center">
PC
</td>
<td class="indexList1" align="center">
Adventure
</td>
<td class="indexList1" align="center">
August 15, 2011
<!--08/15/2011-->
</td>
</tr>
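Every item follows the same pattern, but I would like to know whether I can also retrieve the URL of each item. You may want to look at the page's HTML source, which is probably a better idea.
I would like to store each item's URL in a String.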
Answer 0 (score: 2)
while (postIt.hasNext()) {
// Get the title of the game
Element title = postIt.next();
System.out.println(title.text());
// Get the anchor element
Element url = title.select("a").first();
// Get the URL here @@@
System.out.println(url.attr("href"));
// Unneeded elements
Element platform = postIt.next();
Element genre = postIt.next();
// Get the release date of the game
Element release = postIt.next();
System.out.println(release.text() + "\n@@@@@@");
}
Edit: in your case:
Element name = postIt.next();
String nameString = name.text();
Element url = name.select("a").first();
String urlString = url.attr("href");
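To keep the URL together with the other fields, one option is to give GameRelease a fifth field. The original GameRelease class is not shown in the question, so the version below is only a hypothetical sketch: the url field, the five-argument constructor, and the getter names are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo that stores a URL alongside the other game fields.
class GameReleaseDemo {
    public static void main(String[] args) {
        List<GameRelease> gameList = new ArrayList<>();
        // Values copied from the HTML sample in the question
        gameList.add(new GameRelease(
                "Hacker Evolution Duality", "PC", "Adventure", "August 15, 2011",
                "http://pc.gamespy.com/pc/hacker-evolution-duality-/"));
        for (GameRelease game : gameList) {
            System.out.println(game.getName() + " -> " + game.getUrl());
        }
    }
}

// Hypothetical stand-in for the asker's GameRelease class (assumed shape).
final class GameRelease {
    private final String name;
    private final String platform;
    private final String genre;
    private final String releaseDate;
    private final String url; // assumed new field holding the item's link

    GameRelease(String name, String platform, String genre,
                String releaseDate, String url) {
        this.name = name;
        this.platform = platform;
        this.genre = genre;
        this.releaseDate = releaseDate;
        this.url = url;
    }

    String getName() { return name; }
    String getPlatform() { return platform; }
    String getGenre() { return genre; }
    String getReleaseDate() { return releaseDate; }
    String getUrl() { return url; }
}
```

Inside the loop you would then pass urlString as the last constructor argument when adding to gameList.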
Answer 1 (score: 1)
Every item follows the same pattern, but I would like to know whether I can retrieve the URL of each item.
Elements links = doc.getElementsByTag("a"); // or doc.getElementsByClass("b1")
// Read the href attribute of every matched link
for (Element link : links) {
String linkHref = link.attr("href");
}
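Selecting every link on the page will also pick up links that do not belong to a game, so it may be safer to look for the link inside each table row. The following is a minimal sketch of that idea, assuming jsoup is on the classpath. The class name RowLinkExtractor and the method extract are made up for illustration, and the HTML is the sample from the question wrapped in a table element so that the parser keeps the row.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical helper: walks the table row by row instead of cell by cell.
class RowLinkExtractor {

    // Returns one {name, url} pair for each row that has four indexList cells.
    static List<String[]> extract(Document doc) {
        List<String[]> result = new ArrayList<>();
        for (Element row : doc.select("tr")) {
            Elements cells = row.select("td.indexList1, td.indexList2");
            if (cells.size() < 4) {
                continue; // header or malformed row: skip it
            }
            Element nameCell = cells.get(0);
            Element link = nameCell.select("a[href]").first();
            // absUrl resolves a relative href against the document's base URI
            String url = (link != null) ? link.absUrl("href") : "";
            result.add(new String[] { nameCell.text(), url });
        }
        return result;
    }

    public static void main(String[] args) {
        String html =
              "<table><tr>"
            + "<td class=\"indexList1\" align=\"left\">"
            + "<a href=\"http://pc.gamespy.com/pc/hacker-evolution-duality-/\" class=\"b1\">"
            + "<em>Hacker Evolution Duality </em></a></td>"
            + "<td class=\"indexList1\" align=\"center\">PC</td>"
            + "<td class=\"indexList1\" align=\"center\">Adventure</td>"
            + "<td class=\"indexList1\" align=\"center\">August 15, 2011</td>"
            + "</tr></table>";
        // The second argument is the base URI used to resolve relative links
        Document doc = Jsoup.parse(html, "http://www.gamespy.com/index/release.html");
        for (String[] pair : extract(doc)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Working per row also avoids calling next() four times on a flat iterator, which throws NoSuchElementException if the number of matched cells is not a multiple of four.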