Java jsoup table parsing either skips a row or throws an IndexOutOfBoundsException

Time: 2015-02-26 18:48:47

Tags: java jsoup

Yesterday I ran into a problem while using the jsoup library.

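import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Analyse {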
    public static void main(String[] args) throws IOException, FileNotFoundException {

        try {
            PrintStream output = new PrintStream(new File("E://eBot1.txt"));
            System.setOut(output);
        }
        catch (FileNotFoundException fx) {
            System.out.println(fx);
        }

        for (int i = 1527; i < 1542; i++) {
            String url = "http://csgolive.eslproseries.de/matchs/view/" + i + "#stats-players";
            Document doc = Jsoup.connect(url).get();
            String MatchID = doc.select("h4").text();
            System.out.println("\n\n" + "Spiel: " + MatchID + "\n\n");
            for (Element table : doc.select("table[id=tablePlayers]")) {
                for (Element row : table.select("tr")) {
                    Elements tds2 = row.select("td:not([rowspan])");
                    int vsTwo = Integer.parseInt(tds2.get(13).text());
                    int vsThree = Integer.parseInt(tds2.get(14).text());
                    int vsFour = Integer.parseInt(tds2.get(15).text());
                    int vsFive = Integer.parseInt(tds2.get(16).text());
                    int fourKills = Integer.parseInt(tds2.get(20).text());
                    int fiveKills = Integer.parseInt(tds2.get(21).text());
                    if (vsTwo > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 1v2 Clutch: " + tds2.get(13).text());
                    }
                    if (vsThree > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 1v3 Clutch: " + tds2.get(14).text());
                    }
                    if (vsFour > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 1v4 Clutch: " + tds2.get(15).text());
                    }
                    if (vsFive > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 1v5 Clutch: " + tds2.get(16).text());
                    }
                    if (fourKills > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 4 Kills: " + tds2.get(20).text());
                    }
                    if (fiveKills > 0) {
                        System.out.println("Team: " + tds2.get(0).text() + " Player: " + tds2.get(1).text() + " 5 Kills: " + tds2.get(21).text());
                    }
                    else {
                    }
                }
            }
        }
    }
}

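So basically, I want to parse a table from this page (http://csgolive.eslproseries.de/matchs/view/1529#stats-players). The matchID in the URL (1529 in this case) is supplied by the for loop. That works fine, and so does the Jsoup.connect call, so I do get the HTML from the site. Now I want to pull the player aliases and their matching teams out of the table. The statement for that should be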

for (Element row : table.select("tr")) {

This, however, gives me the following error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 13, Size: 0
    at java.util.ArrayList.rangeCheck(Unknown Source)
    at java.util.ArrayList.get(Unknown Source)
    at org.jsoup.select.Elements.get(Elements.java:544)
    at org.jsoup.Analyse.main(Analyse.java:42)

I have already tried several ways to avoid this. The only one that works is to use

for (Element row : table.select("tr:gt(0)")) {

This only loads table rows 2-10, so I always miss the first one.

So my question is: is there a way to avoid this error and still get all the table rows?

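Edit: I wrote another variant, which revealed the real error: for the first row, tds2.get(13).text(), get(14), and so on throw the IndexOutOfBoundsException. So the first row apparently contains no cells for that selector. However, when I print the row to the console, everything looks fine.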
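One way to make the loop tolerant of such a row is to check the number of cells before indexing into it. The following is a minimal sketch in plain Java, assuming each row has already been reduced to a list of cell texts; the class `RowGuard`, the method `usableRows`, and the constant `MIN_CELLS` are hypothetical names chosen for illustration. In the original loop the same idea would be a size check on `tds2` followed by `continue`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RowGuard {
    // The question's code reads indexes up to 21, so a row needs at least 22 cells.
    static final int MIN_CELLS = 22;

    // Returns only the rows that are long enough to index safely.
    static List<List<String>> usableRows(List<List<String>> rows) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> cells : rows) {
            if (cells.size() < MIN_CELLS) {
                continue; // e.g. a header row that yielded no td cells
            }
            kept.add(cells);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-ins for the empty first row and for one data row (assumed data).
        List<String> header = Collections.emptyList();
        List<String> player = Collections.nCopies(22, "0");
        List<List<String>> rows = Arrays.asList(header, player);
        System.out.println(usableRows(rows).size() + " of " + rows.size() + " rows kept");
    }
}
```

Checking the size rather than skipping a fixed number of rows keeps the loop working even if a page has no header row or more than one.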

Solved:

Instead of

for (Element row : table.select("tr")) {

use

for (int f = 1; f < 11 ; f++) {

1 answer:

Answer 0: (score: 0)

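Does the table you are parsing have a header? It looks like the first row has th cells rather than td cells, in which case this is the expected behavior...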
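The header explanation can be tried out on a small inline table. The sketch below assumes a layout with a thead row of th cells and a tbody of td rows; the real page's markup is not shown in the post, so the `HTML` string, the class `HeaderRowDemo`, and the method `tdCounts` are illustrative assumptions. It compares three row selectors, using jsoup's `:has(selector)` and `:gt(n)` pseudo-selectors.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HeaderRowDemo {
    // Assumed stand-in for the real page: one th header row, two td data rows.
    static final String HTML =
          "<table id=\"tablePlayers\">"
        + "<thead><tr><th>Team</th><th>Player</th><th>1v2</th></tr></thead>"
        + "<tbody>"
        + "<tr><td>Alpha</td><td>one</td><td>1</td></tr>"
        + "<tr><td>Beta</td><td>two</td><td>0</td></tr>"
        + "</tbody></table>";

    // For every row matched by rowSelector, records how many td cells it contains.
    static List<Integer> tdCounts(String rowSelector) {
        Document doc = Jsoup.parse(HTML);
        List<Integer> counts = new ArrayList<>();
        for (Element row : doc.select("table#tablePlayers " + rowSelector)) {
            counts.add(row.select("td").size());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("tr         -> " + tdCounts("tr"));
        System.out.println("tr:has(td) -> " + tdCounts("tr:has(td)"));
        System.out.println("tr:gt(0)   -> " + tdCounts("tr:gt(0)"));
    }
}
```

With this layout, plain `tr` also matches the header row, whose td selection is empty, so any `get(13)` on it fails. `tr:has(td)` keeps only rows that contain at least one td cell. `:gt(0)` compares against a row's sibling index within its own parent, so it drops the first row of the thead and the first row of the tbody. If the real page is structured this way, that would account for the missing first data row.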