How do I extract the href from a link using Jsoup?

Time: 2011-08-03 21:38:12

Tags: java jsoup

I want to get these links:

index.php?limitstart=0&picno=0&gallery_key=92
index.php?limitstart=0&picno=1&gallery_key=92
index.php?limitstart=0&picno=2&gallery_key=92

from this HTML, using Jsoup:

<tr> 
<td style="padding: 8px;"><a onclick="redx_gallery_showImage(0);return false;" href="/module/gallery/index.php?limitstart=0&amp;picno=0&amp;gallery_key=92"><img width="90" height="90"  style='border: 1px #BAB9AF solid'   src='/redx_tools/mb_image.php/cid.077117104075119048121090118052048061/gid.10/pyrit_club_2_buche.jpg' border='1'    alt=''/></a></td> 
    <td style="padding: 8px;"><a onclick="redx_gallery_showImage(1);return false;" href="/module/gallery/index.php?limitstart=0&amp;picno=1&amp;gallery_key=92"><img width="90" height="90"  style='border: 1px #BAB9AF solid'   src='/redx_tools/mb_image.php/cid.085057100083102116053082117052115061/gid.10/pyrit_club_2_weiss.jpg' border='1'    alt=''/></a></td> 
    <td style="padding: 8px;"><a onclick="redx_gallery_showImage(2);return false;" href="/module/gallery/index.php?limitstart=0&amp;picno=2&amp;gallery_key=92"><img width="90" height="90"  style='border: 1px #BAB9AF solid'   src='/redx_tools/mb_image.php/cid.120068065087108097121088078055048061/gid.10/pyrit_club_2_wei_2.jpg' border='1'    alt=''/></a></td> 
</tr> 

Any ideas? Thanks.

1 answer:

Answer 0: (score: 4)

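You need to know the id of the common container element so that you can select all of the links with a single CSS selector. According to the linked page, it is <div id="redx_gallery_thumb_list">.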

So, this should do:

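import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// `document` is assumed to be an already parsed org.jsoup.nodes.Document.
// Select every <a> inside the container with id "redx_gallery_thumb_list".
Elements links = document.select("#redx_gallery_thumb_list a");

for (Element link : links) {
    // The href exactly as written in the markup (entities such as &amp; are decoded).
    String href = link.attr("href");

    // Or, if you want the absolute URL instead, so that you can fetch it.
    // This needs a base URI on the document; otherwise it returns an empty string.
    String absUrl = link.absUrl("href");

    // ...
}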
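
One detail the snippet above leaves open: the question lists the links without the /module/gallery/ prefix, whereas link.attr("href") returns the attribute value as written in the markup, path included. If only the trailing part is wanted, a small string helper can trim it. The following is a minimal sketch using only the standard library; HrefTail and lastSegment are hypothetical names chosen for illustration and are not part of Jsoup.

```java
public class HrefTail {

    // Hypothetical helper (not a Jsoup API): returns everything after the
    // last '/' that occurs before the query string, keeping the query intact.
    static String lastSegment(String href) {
        int query = href.indexOf('?');
        // Only look for slashes in the path part, so that a '/' inside the
        // query string does not cut the result short.
        int searchEnd = (query >= 0) ? query : href.length();
        int slash = href.lastIndexOf('/', searchEnd);
        // If there is no slash, lastIndexOf returns -1 and the whole string is kept.
        return href.substring(slash + 1);
    }

    public static void main(String[] args) {
        // Sample value taken from the question, with &amp; already decoded.
        String href = "/module/gallery/index.php?limitstart=0&picno=0&gallery_key=92";
        System.out.println(lastSegment(href));
        // prints: index.php?limitstart=0&picno=0&gallery_key=92
    }
}
```

Inside the loop this would be called as HrefTail.lastSegment(link.attr("href")), assuming the helper class is on the classpath.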