Question

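I'm new to web scraping, so this question may not be perfectly framed. I am trying to extract all the drug name links from a given page, and by extension all the A-Z drug links, and then iterate over those links to extract information from each of them, such as generic name, brand, etc. I have some very basic code below that does not work. Any help with this would be much appreciated.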

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class WebScraper {
  public static void main(String[] args) throws Exception {

    String keyword = "a"; // will eventually iterate through all the letters
    String url = "http://www.medindia.net/drug-price/brand-index.asp?alpha=" + keyword;

    Document doc = Jsoup.connect(url).get();
    Element table = doc.select("table").first();
    Elements links = table.select("a[href]"); // <a> elements that have an href
    for (Element link : links) {
      System.out.println(link.attr("href"));
    }
  }
}

Answer 1

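After looking at the site and at what you expect to get, it looks like you are grabbing the wrong table element. You don't want the first table; you want the second one.

To get a specific table, you can use: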

Element table = doc.select("table").get(1);

This gets the table at index 1 (indices are zero-based), i.e. the second table in the document.
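
As a rough, offline illustration of the same idea, here is a minimal sketch that parses a small inline HTML string instead of the live page. The sample markup, the base URI http://www.example.com/, and the names TableLinkDemo and linksFromTable are all made up for this example; the real page's layout may differ. Note that doc.select("table") also matches nested tables, in document order, so the right index on the real site should be checked against the actual HTML.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class TableLinkDemo {

    // Hypothetical stand-in for the real page: a navigation table followed by a data table.
    static final String SAMPLE_HTML =
        "<table><tr><td><a href='/home'>Home</a></td></tr></table>"
      + "<table><tr>"
      + "<td><a href='/drugs/a1'>Drug A1</a></td>"
      + "<td><a href='/drugs/a2'>Drug A2</a></td>"
      + "</tr></table>";

    // Returns the absolute link URLs found in the table at the given zero-based index,
    // or an empty list when the document has fewer tables than that.
    static List<String> linksFromTable(String html, String baseUri, int tableIndex) {
        Document doc = Jsoup.parse(html, baseUri);
        Elements tables = doc.select("table");
        List<String> urls = new ArrayList<>();
        if (tableIndex < 0 || tableIndex >= tables.size()) {
            return urls; // avoid an IndexOutOfBoundsException from get()
        }
        Element table = tables.get(tableIndex);
        for (Element link : table.select("a[href]")) {
            urls.add(link.absUrl("href")); // resolve relative hrefs against baseUri
        }
        return urls;
    }

    public static void main(String[] args) {
        // Index 1 skips the navigation table and reads the second table.
        for (String url : linksFromTable(SAMPLE_HTML, "http://www.example.com/", 1)) {
            System.out.println(url);
        }
    }
}
```

Checking the index against tables.size() before calling get() keeps the scraper from crashing if a page has fewer tables than expected. Collecting absolute URLs with absUrl("href") should also make it easier to pass each link straight to Jsoup.connect(...) when you later visit the individual drug pages.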

Using Jsoup to get links from a table and all tabs of a website
