Question

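I am working on a project where I need to parse HTML to extract data from a web page. I am using Jsoup in Java. I need to extract data from the following markup.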

<tr>
            <td><small><a href="http://www.timeanddate.com/worldclock/fixedtime.html?iso=20160821T2100&amp;p1=248" target="_blank">2016/08/21 21:00</a></small></td>
            <td><small><a href="https://agc003.contest.atcoder.jp">AtCoder Grand Contest 003</a></small></td>

</tr>

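I can already get the contest name and the start time, but how do I extract the URL? I want to get the contest URL, https://agc003.contest.atcoder.jp. How can I do this?

Edit: Here is my code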



private void getAC() throws IOException {

    Document doc = Jsoup.connect("https://atcoder.jp/").userAgent(Desktop.getDesktop().toString()).get();
    Element table = doc.getElementsByClass("table-responsive").get(1);
    Elements contestStartTime = table.getElementsByTag("td");
    int cnt = 1;
    for (Element i : contestStartTime) {
        System.out.println(cnt + ". " + i.html());
        cnt++;
    }

}


Answer 1

Jsoup provides a rich API for working with the DOM; look for this function:

In addition, you can get the link this way:

Parsing the HTML href attribute
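The function the answer points to is not preserved here. A minimal sketch, assuming it is `Element.attr("href")`: the example below parses the row from the question (wrapped in a `<table>` so the HTML parser keeps the cells) and reads the `href` of the link in the second cell. The class name `ContestLinkExtractor` and the helper `contestUrls` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class ContestLinkExtractor {

    // Hypothetical helper: for every table row, return the href of the
    // first link found in the second cell (the contest name column).
    static List<String> contestUrls(String html) {
        Document doc = Jsoup.parse(html);
        List<String> urls = new ArrayList<>();
        for (Element row : doc.select("tr")) {
            Elements cells = row.select("td");
            if (cells.size() < 2) {
                continue; // header rows or rows without a contest cell
            }
            Element link = cells.get(1).select("a[href]").first();
            if (link != null) {
                urls.add(link.attr("href")); // raw attribute value
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // The row from the question, wrapped in <table> for the parser.
        String html = "<table><tr>"
                + "<td><small><a href=\"http://www.timeanddate.com/worldclock/fixedtime.html?iso=20160821T2100&amp;p1=248\" target=\"_blank\">2016/08/21 21:00</a></small></td>"
                + "<td><small><a href=\"https://agc003.contest.atcoder.jp\">AtCoder Grand Contest 003</a></small></td>"
                + "</tr></table>";
        for (String url : contestUrls(html)) {
            System.out.println(url);
        }
    }
}
```

In the loop from the question, the same idea would mean selecting the `a` element inside each `td` and calling `attr("href")` on it instead of printing `i.html()`. If a page uses relative links, `link.absUrl("href")` resolves them against the document's base URI, which is set automatically when the page is fetched with `Jsoup.connect(...).get()`.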
