Question

I am trying to extract the links (the title and its address) from all <h3> heading tags in a web page.

The code I tried is:

String u = "http://www.thehindu.com/business/";
Document docu = (Document) Jsoup.connect(u).get();

Elements lnk = docu.select("h3");
for (Element an : lnk) {
    String s = an.attr("abs:href");
    String name = an.text();
    System.out.println(s);
}

I am not getting any output. What is the problem?

Answer 1

You are selecting h3 and trying to read its href attribute, but an h3 has no href attribute (there is no <h3 href="foobar">). What you want to select is the a element located inside the h3, and read the href value from that element.

So your code should look more like this:

String u = "http://www.thehindu.com/business/";
Document docu = (Document) Jsoup.connect(u).get();

Elements lnk = docu.select("h3 a[href]");
for (Element an : lnk) {
    String s = an.attr("abs:href");
    String name = an.text();
    System.out.println(name);
    System.out.println(s);
    System.out.println("--------");
}
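To see the difference between the two selectors without hitting the network, here is a minimal offline sketch. It assumes jsoup is on the classpath; the class name HeadingLinks, the helper extractHeadingLinks, and the sample markup are hypothetical and are not taken from the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class HeadingLinks {

    // Hypothetical helper: returns "title -> absolute URL" for every
    // <a href> element nested inside an <h3>.
    static List<String> extractHeadingLinks(String html, String baseUri) {
        // The base URI lets jsoup resolve relative hrefs for "abs:href".
        Document doc = Jsoup.parse(html, baseUri);
        List<String> result = new ArrayList<>();
        for (Element a : doc.select("h3 a[href]")) {
            result.add(a.text() + " -> " + a.attr("abs:href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up markup for illustration, not copied from the site.
        String html = "<h3><a href=\"/business/markets/\">Markets</a></h3>"
                    + "<h3>Heading without a link</h3>";
        String base = "http://www.thehindu.com/business/";

        // The <h3> itself carries no href, so attr() yields an empty string.
        String onHeading = Jsoup.parse(html, base).select("h3").first().attr("abs:href");
        System.out.println("href read from h3 is empty: " + onHeading.isEmpty());

        // Selecting the nested anchors gives the title and resolved address.
        for (String line : extractHeadingLinks(html, base)) {
            System.out.println(line);
        }
    }
}
```

Headings that contain no link are simply skipped by the "h3 a[href]" selector, so no extra null or empty-string check is needed in the loop.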

Extracting links from all heading tags using jSoup
