Get the value of the second-to-last list item with Jsoup

Date: 2016-05-16 10:31:54

Tags: java web-scraping jsoup

I have the following HTML code:

<div id="wishlistPagination" class="a-container">
<span class="a-declarative" data-action="ajax-pagination" data-ajax-pagination="{}">
    <div class="a-text-center">
        <ul class="a-pagination">
            <li class="a-disabled">&larr;
                <span class="a-letter-space"></span>
                <span class="a-letter-space"></span>Previous
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:1}" class="a-selected">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_1">1</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:2}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_2?ie=UTF8&page=2">2</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:3}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_3?ie=UTF8&page=3">3</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:4}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_4?ie=UTF8&page=4">4</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:5}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_5?ie=UTF8&page=5">5</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:6}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_6?ie=UTF8&page=6">6</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:7}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_7?ie=UTF8&page=7">7</a>
            </li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:&quot;&amp;hellip;&quot;}" class="a-disabled">&hellip;</li>
            <li data-action="pag-trigger" data-pag-trigger="{&quot;page&quot;:9}" class="a-">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_9?ie=UTF8&page=9">9</a>
            </li>
            <li class="a-last">
                <a href="/gp/registry/wishlist/3C96S5RO2A5A9/ref=cm_wl_sortbar_v_page_2?ie=UTF8&page=2">Next
                    <span class="a-letter-space"></span>
                    <span class="a-letter-space"></span>&rarr;
                </a>
            </li>
        </ul>
    </div>
</span>
</div>

Java code

  Document doc = Jsoup.connect("https://www.sample_url.com").timeout(10 * 1000).post();
  Elements pages = doc.select("li[class*=a-last]");
  System.out.println("Value of List Item"+pages.get(0).text());

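In the example above, I am trying to get the value of the last page-number "li" tag, which is "9" in this case. The value is dynamic, so in other cases it could just as well be 100. Currently I only get the very last "li" tag, which is the "Next" link.

Output

Next

Expected output

9

I can't figure out how to get the desired value. Please help.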

2 answers:

Answer 0 (score: 1)

Not sure whether this fits your case, but you can use the Jsoup Elements last() method:

    Document doc = Jsoup.connect("https://www.sample_url.com").timeout(10 * 1000).post();
    Element lastPagTrigger = doc.select("li[data-action=pag-trigger]").last();
    System.out.println("Value of List Item" + lastPagTrigger.text());
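
To try the selector without hitting the network, here is a minimal sketch that runs the same idea against an inline HTML string. The class name LastPageByDataAttribute, the helper lastPageText, and the shortened markup are made up for illustration. The null check is there because last() returns null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LastPageByDataAttribute {

    // Hypothetical helper: text of the last <li data-action="pag-trigger">,
    // or null when the document contains no such element.
    static String lastPageText(String html) {
        Document doc = Jsoup.parse(html);
        Element last = doc.select("li[data-action=pag-trigger]").last();
        return last == null ? null : last.text();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the pagination markup in the question.
        String html = "<ul class=\"a-pagination\">"
                + "<li class=\"a-disabled\">Previous</li>"
                + "<li data-action=\"pag-trigger\" class=\"a-selected\"><a href=\"#\">1</a></li>"
                + "<li data-action=\"pag-trigger\" class=\"a-disabled\">&hellip;</li>"
                + "<li data-action=\"pag-trigger\" class=\"a-\"><a href=\"#\">9</a></li>"
                + "<li class=\"a-last\"><a href=\"#\">Next</a></li>"
                + "</ul>";
        System.out.println("Value of List Item: " + lastPageText(html));
    }
}
```

This relies on the "Next" item not carrying data-action="pag-trigger", which holds for the markup in the question but is an assumption about the live page.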

Answer 1 (score: 0)

Assuming the element with class a-last is the first match, the following code gets the text of its previous element sibling.

Element page = doc.select("li[class*=a-last]").first();
System.out.println("Value of List Item : " + page.previousElementSibling().text());
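
A self-contained version of the sibling approach might look like the sketch below. It parses an inline string instead of connecting to a URL. The class name LastPageBySibling, the helper pageBeforeNext, and the shortened markup are hypothetical. Both first() and previousElementSibling() can return null, so the sketch checks each one rather than assuming the pagination is always present. It also uses the class selector li.a-last instead of the substring match li[class*=a-last], which is a small deliberate change.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LastPageBySibling {

    // Hypothetical helper: text of the element just before <li class="a-last">,
    // or null when either the "Next" item or its previous sibling is missing.
    static String pageBeforeNext(String html) {
        Document doc = Jsoup.parse(html);
        Element next = doc.select("li.a-last").first();
        if (next == null) {
            return null;
        }
        Element previous = next.previousElementSibling();
        return previous == null ? null : previous.text();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the pagination markup in the question.
        String html = "<ul class=\"a-pagination\">"
                + "<li data-action=\"pag-trigger\" class=\"a-selected\"><a href=\"#\">1</a></li>"
                + "<li data-action=\"pag-trigger\" class=\"a-\"><a href=\"#\">9</a></li>"
                + "<li class=\"a-last\"><a href=\"#\">Next</a></li>"
                + "</ul>";
        System.out.println("Value of List Item : " + pageBeforeNext(html));
    }
}
```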

Tom Mac's answer is also clear: it gets the last list element directly through its data attribute.