Scraping a link by its link name

Time: 2017-12-20 11:27:18

Tags: java web-scraping jsoup

I am trying to scrape links with jsoup. The two links have exactly the same markup, but I only want the second one. Any suggestions?

I tried this, but it does not work:

 Element pagination2 = document3.select("div.pagination").first();
 Elements Link2 =pagination2.select("a.older");

2 answers:

Answer 0 (score: 1)

This should be simple. Something like the following should work:

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) throws IOException {
        final String url = "https://github.com/apple/turicreate/commits/master?after=b7432a7e73c8efa0466e7b338f2717d392ba1f72+34";
        final Document doc = Jsoup.connect(url).get();
        final Elements elements = doc.select("div.pagination a"); // get all "a" elements

        // get the second element via index (throws if fewer than two links match)
        final Element secondElement = elements.get(1);
        // get the href attribute (link)
        final String href = secondElement.attr("href");
        // get the text of the second element
        final String older = secondElement.text();
        System.out.println(href + " " + older);
    }
}
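Since the question asks about picking the link by its name, an alternative to indexing is to match on the link text with jsoup's `:containsOwn(text)` pseudo-selector. The sketch below is only an illustration: the sample markup, the link texts "Newer" and "Older", and the helper name `hrefByLinkText` are assumptions, not taken from the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OlderLinkByText {

    // Hypothetical markup standing in for the real page; the link texts
    // "Newer" and "Older" are assumptions, not taken from the question.
    static final String SAMPLE_HTML =
            "<div class=\"pagination\">"
          + "<a href=\"/commits?before=abc\">Newer</a>"
          + "<a href=\"/commits?after=abc\">Older</a>"
          + "</div>";

    // Returns the href of the first pagination link whose own text contains
    // the given label (the match is case-insensitive), or null if none matches.
    static String hrefByLinkText(Document doc, String label) {
        Element link = doc.select("div.pagination a:containsOwn(" + label + ")").first();
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(hrefByLinkText(doc, "Older"));
    }
}
```

Matching on the text keeps working when the number or order of links in the pagination block changes, which an index-based lookup does not.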

Answer 1 (score: 0)

I solved it with this:

Element pagination2 = document3.select("div.pagination a").get(0);

gives the first link, and

Element pagination2 = document3.select("div.pagination a").get(1);

gives the second link.
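One caveat with `get(1)`: `Elements` is a list, so the call throws `IndexOutOfBoundsException` when the page has fewer than two matching links. A minimal guarded sketch might look like this; the helper name `paginationHref`, the sample markup, and the base URI `https://example.com/` are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class SecondLinkGuarded {

    // Returns the absolute URL of the pagination link at the given index,
    // or null when the page has fewer links than expected.
    static String paginationHref(Document doc, int index) {
        Elements links = doc.select("div.pagination a");
        if (index < 0 || index >= links.size()) {
            return null;
        }
        // "abs:href" resolves a relative href against the document's base URI.
        return links.get(index).attr("abs:href");
    }

    public static void main(String[] args) {
        // Hypothetical page with a single pagination link.
        String html = "<div class=\"pagination\"><a href=\"/commits?after=abc\">Older</a></div>";
        Document doc = Jsoup.parse(html, "https://example.com/");
        System.out.println(paginationHref(doc, 0));
        System.out.println(paginationHref(doc, 1));
    }
}
```

When the document is fetched with `Jsoup.connect(url).get()`, the base URI is set from the request URL, so `abs:href` can resolve relative links without passing a base URI by hand.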