Here is the code snippet I wrote:
String url = "https://www.premierleague.com/tables";
Document doc = Jsoup.connect(url).get();
Element table = doc.select("table").first();
Iterator<Element> rank = table.select("td[id=tooltip]").iterator(); // Position
Iterator<Element> team = table.select("td[class=team]").iterator(); // Club
Iterator<Element> points = table.select("td[class=points]").iterator(); // Points
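I can get the position, club, and points because I can identify those cells by class name or ID, but I can't get the other columns, such as Played, Won, Draw, Loss, GF, GA, and GD.
Can anyone help me?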
Answer 0 (score: 3)
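You can use selectors based on the table's structure. See this example, which gets the first entry in the Won column: http://try.jsoup.org/~camnKp8NJYL0meyfIRXEtV8E5B4
To get the selector right, you can use the developer tools in Google Chrome (F12): right-click the element in the Elements tab and choose Copy -> Copy selector.
For the other columns, select each cell by its position within the row: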
Iterator<Element> gamesPlayed = table.select("tbody tr > td:nth-child(4)").iterator();
Iterator<Element> gamesWon = table.select("tbody tr > td:nth-child(5)").iterator();
Iterator<Element> gamesDrawn = table.select("tbody tr > td:nth-child(6)").iterator();
Iterator<Element> gamesLost = table.select("tbody tr > td:nth-child(7)").iterator();
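The four selectors above differ only in the index passed to :nth-child, which counts from 1. A minimal sketch of generating them for every statistics column follows. The class name StatSelectors and its methods are hypothetical helpers, and the positions 8 to 10 for GF, GA, and GD are an assumption extrapolated from the column order in the question, so check them against the live page before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds one CSS selector per statistics column.
final class StatSelectors {
    // Column names in the order they are assumed to appear in each row.
    // Positions 4..7 (Played..Lost) come from the selectors above;
    // positions 8..10 (GF, GA, GD) are an unverified assumption.
    private static final String[] COLUMNS = {"Played", "Won", "Drawn", "Lost", "GF", "GA", "GD"};
    private static final int FIRST_STAT_COLUMN = 4; // :nth-child is 1-based

    // Returns the selector for the cell at the given 1-based position in each body row.
    static String selectorFor(int nthChild) {
        return "tbody tr > td:nth-child(" + nthChild + ")";
    }

    // Maps each column name to its selector, preserving column order.
    static Map<String, String> allSelectors() {
        Map<String, String> selectors = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            selectors.put(COLUMNS[i], selectorFor(FIRST_STAT_COLUMN + i));
        }
        return selectors;
    }

    public static void main(String[] args) {
        allSelectors().forEach((name, css) -> System.out.println(name + " -> " + css));
    }
}
```

Each generated string can be passed to table.select(...) exactly like the hand-written selectors. Note the off-by-one relative to list indexing: td:nth-child(4) addresses the same cell that a 0-based cells.get(3) would.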
Alternatively, parse the table row by row and store the cell values, as in the following example:
Example code
String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
String url = "https://www.premierleague.com/tables";
Document doc;
String position, club, played, won;
try {
    doc = Jsoup.connect(url).userAgent(userAgent).get();
    Element table = doc.select("table").first();
    for (Element row : table.select("tr")) {
        Elements cells = row.select("td");
        if (cells.size() < 5) continue;
        position = cells.get(1).select(".value").first().text();
        club = cells.get(2).select(".long").first().text();
        played = cells.get(3).text();
        won = cells.get(4).text();
        System.out.println(position + " " + " " + club + "\n\tplayed: " + played + " won: " + won);
    }
} catch (IOException e) {
    e.printStackTrace();
}
Output
1 Chelsea
played: 21 won: 17
2 Arsenal
played: 22 won: 14
3 Tottenham Hotspur
played: 22 won: 13
4 Liverpool
played: 22 won: 13
5 Manchester City
played: 22 won: 13
...
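The example above prints each row as it is read. If the values need to be kept for later use, one option is a small holder class per row. The sketch below is an illustration only: TeamRow and fromCells are hypothetical names, and it assumes the cell texts have already been extracted into a list in the same 0-based order the loop uses (index 1 for position, 2 for club, 3 for played, 4 for won).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the values read from one table row.
final class TeamRow {
    final int position;
    final String club;
    final int played;
    final int won;

    TeamRow(int position, String club, int played, int won) {
        this.position = position;
        this.club = club;
        this.played = played;
        this.won = won;
    }

    // cellTexts holds the text of each cell in one row, in document order (0-based).
    // Assumed layout: index 1 = position, 2 = club, 3 = played, 4 = won.
    static TeamRow fromCells(List<String> cellTexts) {
        if (cellTexts.size() < 5) {
            throw new IllegalArgumentException("expected at least 5 cells, got " + cellTexts.size());
        }
        return new TeamRow(
                Integer.parseInt(cellTexts.get(1).trim()),
                cellTexts.get(2).trim(),
                Integer.parseInt(cellTexts.get(3).trim()),
                Integer.parseInt(cellTexts.get(4).trim()));
    }

    @Override
    public String toString() {
        return position + " " + club + " played: " + played + " won: " + won;
    }

    public static void main(String[] args) {
        // Sample values taken from the first two rows of the output above.
        List<TeamRow> rows = new ArrayList<>();
        rows.add(fromCells(Arrays.asList("", "1", "Chelsea", "21", "17")));
        rows.add(fromCells(Arrays.asList("", "2", "Arsenal", "22", "14")));
        for (TeamRow row : rows) {
            System.out.println(row);
        }
    }
}
```

Inside the scraping loop, the list passed to fromCells could be built from the same text() calls the example already makes. Parsing the numbers once at this point means later code can sort or compare rows without re-parsing strings.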