How do I count the number of divs with jsoup?

Time: 2013-10-17 22:32:44

Tags: java android html parsing jsoup

What I need to do is count all the "news_main" divs on this page:

         <h1>Notice to Mariners</h1>
         <form name="filter-form" id="filter-form" action="/notice-to-mariners/"       enctype="multipart/form-data" accept-charset="UTF-8" method="post">
    <div style="display: none"><input type="hidden" name="filter-form" value="1"></div>
    <div style="display:none; width:0px; height:0px;"><p><label class="indent" for="filter-form-leave_blank">If you are human leave this blank:</label><input id="filter-form-leave_blank" class="" type="text" name="filter-form-leave_blank" value=""></p></div><div id="filter"><select class="" name="filter" id="filter-form-filter">
    <option value="form_error">View notices in force</option>
    <option value="1">View notices not in force</option>
    <option value="2">View all notices</option>
    </select><button type="submit">Filter</button></div><!-- / filter --></form>
        <div class="news_main">
        <div class="news_main">
        <div class="news_main">
        <div class="news_main">
        <div class="news_main">

...and so on.

I have tried various approaches, but they all seem to return 0.

Code:

docNtm = Jsoup.connect("http://www.mhpa.co.uk/notice-to-mariners/").timeout(600000).get();
Elements ntmAmount = docNtm.select("div.news_main div");

System.out.println("size:  " + ntmAmount.size());

Thanks for any suggestions.

Edit:

I can now retrieve all the divs, and the output looks like this:

 10-18 22:41:36.365: I/System.out(14624): size:  0
 10-18 22:41:36.365: I/System.out(14624): size:  0
 10-18 22:41:36.365: I/System.out(14624): size:  0
 10-18 22:41:36.365: I/System.out(14624): size:  0
 10-18 22:41:36.365: I/System.out(14624): size:  0
 10-18 22:41:36.365: I/System.out(14624): size:  0
  .....etc

What is the best way to count them?

Thanks

3 answers:

Answer 0 (score: 3)

Use Element.getElementsByTag("div") together with Element.hasClass("news_main"):

Document doc = Jsoup.parse(input, "UTF-8", "http://www.mhpa.co.uk/notice-to-mariners/");

Element content = doc.getElementById("content");
Elements divs = content.getElementsByTag("div");
int ntmAmount = 0;
for (Element div : divs) {
  if (div.hasClass("news_main"))
    ntmAmount++;
}

Or use Element.getElementsByClass("news_main"), which returns only the matching elements:

...
Elements ntmDivs = content.getElementsByClass("news_main");
int ntmAmount = ntmDivs.size();
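
One caveat for both variants: getElementById returns null when no element has that id, and the markup quoted in the question does not show a "content" element. Below is a minimal sketch that guards against that case. The helper name countByClass and the inline sample HTML are assumptions made for illustration; they are not taken from the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ClassCount {
    // Hypothetical helper: counts elements carrying className inside the element
    // with id containerId, or in the whole document when that id does not exist.
    static int countByClass(String html, String containerId, String className) {
        Document doc = Jsoup.parse(html);
        Element container = doc.getElementById(containerId); // null if the id is absent
        Element scope = (container != null) ? container : doc;
        return scope.getElementsByClass(className).size();
    }

    public static void main(String[] args) {
        // Made-up markup: two matching divs inside #content, one outside it.
        String html = "<div id=\"content\">"
                + "<div class=\"news_main\">A</div>"
                + "<div class=\"news_main\">B</div>"
                + "</div>"
                + "<div class=\"news_main\">outside</div>";
        System.out.println(countByClass(html, "content", "news_main")); // 2
        System.out.println(countByClass(html, "missing", "news_main")); // 3
    }
}
```

Falling back to the whole document is only one possible choice; returning 0 or throwing may suit your app better.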

Answer 1 (score: 0)

Replace:

Elements ntmAmount = docNtm.select("div.news_main div"); 

with this:

Elements ntmAmount = docNtm.select("div.news_main"); 
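
The reason this helps: a space in a CSS selector is the descendant combinator, so "div.news_main div" matches div elements nested inside a news_main div, not the news_main divs themselves. Here is a small sketch of the difference, run against made-up markup modeled on the question (SAMPLE_HTML and the count helper are hypothetical, and the real page may be structured differently):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorDemo {
    // Hypothetical sample: three news_main divs with no divs nested inside them.
    static final String SAMPLE_HTML =
            "<div id=\"content\">"
          + "<div class=\"news_main\"><p>Notice 1</p></div>"
          + "<div class=\"news_main\"><p>Notice 2</p></div>"
          + "<div class=\"news_main\"><p>Notice 3</p></div>"
          + "</div>";

    // Parses the HTML and returns how many elements the CSS query matches.
    static int count(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Descendant selector: looks for divs inside news_main divs.
        System.out.println(count(SAMPLE_HTML, "div.news_main div")); // 0
        // Class selector on the div itself.
        System.out.println(count(SAMPLE_HTML, "div.news_main"));     // 3
    }
}
```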

Answer 2 (score: 0)

I'm not sure about a direct method, but a for loop should work...

int count = 0;
for (Element n : ntmAmount) {
    count++;
}

This assumes that ntmAmount refers to all the <div> elements you want... which, as others have pointed out, it does not.

Instead, you need Elements ntmAmount = docNtm.select("div.news_main");
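
As for a direct method: jsoup's Elements class extends ArrayList<Element>, so size() already gives the count and the loop is not strictly needed. A short sketch comparing the two, using inline sample HTML and hypothetical helper names (selectNewsMain, countWithLoop):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SizeVsLoop {
    // Hypothetical helper: selects every div with class news_main.
    static Elements selectNewsMain(String html) {
        return Jsoup.parse(html).select("div.news_main");
    }

    // Counts by iterating, as in the loop above.
    static int countWithLoop(Elements elements) {
        int count = 0;
        for (Element ignored : elements) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up markup with three matching divs.
        String html = "<div class=\"news_main\">1</div>"
                + "<div class=\"news_main\">2</div>"
                + "<div class=\"news_main\">3</div>";
        Elements found = selectNewsMain(html);
        System.out.println(countWithLoop(found)); // 3
        System.out.println(found.size());         // 3, same result without the loop
    }
}
```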