Scraping Google search results with Jsoup and getting the website link from the button on the right

Date: 2017-03-25 12:10:50

Tags: java jsoup

So far I have written this function.

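import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Assumes doc (org.jsoup.nodes.Document), mainLinks and print(...) are declared elsewhere in the class.
private static void getAllLinks(String URL) {
    try {
        // Fetch and parse the search results page.
        doc = Jsoup.connect(URL).userAgent("Chrome").ignoreHttpErrors(true).get();

        // Select the anchors that are direct children of the right-hand panel div.
        Elements links = doc.select("div._mdf._ykh.kno-fb-ctx3 > a");
        print("\nLinks: (%d)", links.size());
        for (Element link : links) {
            // "abs:href" resolves the href against the page's base URI.
            String href = link.attr("abs:href");
            print("%s", href);
            mainLinks.add(href);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}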

The page URL is https://www.google.com/search?q=Ajo+Calvary+Baptist+Church
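For reference, a URL of this shape can be built from a plain-text query instead of being typed by hand. The following is only a minimal sketch using the standard library; the class name `SearchUrlBuilder` and the method `buildSearchUrl` are hypothetical and do not appear in the original post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SearchUrlBuilder {

    // Hypothetical helper (not part of the original post): builds a
    // Google search URL for a plain-text query.
    static String buildSearchUrl(String query) {
        try {
            // URLEncoder applies form encoding, so spaces become '+'.
            return "https://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints https://www.google.com/search?q=Ajo+Calvary+Baptist+Church
        System.out.println(buildSearchUrl("Ajo Calvary Baptist Church"));
    }
}
```

The resulting string could then be passed to `getAllLinks`.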

see Image

1 answer:

Answer 0 (score: 0)

Found a solution; see the comments above for the explanation. You need HtmlUnit rather than Jsoup.

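    // Needs HtmlUnit 2.x: WebClient from com.gargoylesoftware.htmlunit,
    // HtmlPage and HtmlAnchor from com.gargoylesoftware.htmlunit.html.
    try (WebClient webClient = new WebClient()) {
        // Load the results page through HtmlUnit.
        final HtmlPage page = webClient.getPage("https://www.google.com/search?q=Ajo+Calvary+Baptist+Church");
        // Take the first anchor whose class attribute is exactly "ab_button";
        // get(0) throws if nothing matches, which the catch below reports.
        HtmlAnchor el = (HtmlAnchor) page.getByXPath("//a[@class='ab_button']").get(0);
        System.out.println(el.getAttribute("href"));
    } catch (Exception e) {
        e.printStackTrace();
    }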