JSoup: getting specific data from a web page

Asked: 2015-02-01 11:09:11

Tags: java web-scraping jsoup webpage

I have been trying to get data from http://www.betvictor.com/sports/en/to-lead-anytime, and I would like to use JSoup to extract the list of matches.

For example: Caen v AS Saint Etienne, Celtic v Rangers

and so on...

My current code is:

String couponPage = "http://www.betvictor.com/sports/en/to-lead-anytime";
Document doc1 = Jsoup.connect(couponPage).get();

String match = doc1.select("#coupon_143751140 > table:nth-child(3) > tbody > tr:nth-child(2) > td.event_description").text();
System.out.println("match:" + match);

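Once I work out how to get one piece of data, I will put it in a for loop to go through the whole table, but first I need to get a single item.

At the moment the output is just "match:", so it looks like the "match" variable is empty.

Any help is much appreciated.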

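1 Answer:

Answer 0: (score: 0)

After a few hours of experimenting, I found the answer to my own question. It turned out the page was not loading properly, and I had to call the "timeout" method on the connection: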

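import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc;
try {

    // the URL needs the http protocol; timeout(10000) allows up to 10 seconds for the request
    doc = Jsoup.connect("http://www.betvictor.com/sports/en/football/coupons/100/0/0/43438/0/100/0/0/0/0/1").timeout(10000).get();

    // select the link inside each event description cell
    Elements matches = doc.select("td.event_description a");
    // note: this uses the same selector as matches and is not used below
    Elements odds = doc.select("td.event_description a");
    for (Element match : matches) {

        // the link text is the fixture, e.g. "Caen v AS Saint Etienne"
        String matchEvent = match.text();
        String[] parts = matchEvent.split(" v ");
        if (parts.length < 2) {
            continue; // skip text that is not in "team1 v team2" form
        }
        String team1 = parts[0];
        String team2 = parts[1];
        System.out.println("text : " + team1 + " v " + team2);

    }
} catch (IOException e) {
    // thrown when the page cannot be fetched, including when the timeout expires
    e.printStackTrace();
}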
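
Splitting on " v " assumes every link text has the form "Team1 v Team2". The following is a minimal sketch of a more defensive version of that split, using only the standard library so it can be run without fetching the page. The names FixtureParser, Fixture and parseFixture are hypothetical and are not part of jsoup; the sample strings are the fixtures from the question plus one made-up non-fixture line.

```java
import java.util.Optional;

public class FixtureParser {

    /** Hypothetical value type holding the two sides of a fixture. */
    public static final class Fixture {
        public final String home;
        public final String away;

        Fixture(String home, String away) {
            this.home = home;
            this.away = away;
        }
    }

    /**
     * Splits text such as "Caen v AS Saint Etienne" into two team names.
     * Returns Optional.empty() when the text is null or has no " v " separator.
     */
    public static Optional<Fixture> parseFixture(String text) {
        if (text == null) {
            return Optional.empty();
        }
        // Limit of 2: only the first " v " is treated as the separator.
        String[] parts = text.trim().split(" v ", 2);
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new Fixture(parts[0].trim(), parts[1].trim()));
    }

    public static void main(String[] args) {
        // The last sample is an assumed example of text that is not a fixture.
        String[] samples = {"Caen v AS Saint Etienne", "Celtic v Rangers", "Outright winner"};
        for (String sample : samples) {
            Optional<Fixture> fixture = parseFixture(sample);
            if (fixture.isPresent()) {
                System.out.println(fixture.get().home + " | " + fixture.get().away);
            } else {
                System.out.println("skipped: " + sample);
            }
        }
    }
}
```

Inside the scraping loop, this could be called as parseFixture(match.text()), so that a row without a " v " separator is skipped instead of throwing an ArrayIndexOutOfBoundsException.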