HtmlUnit and decrypting a span element

Time: 2015-11-13 14:22:54

Tags: java htmlunit

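I am trying to scrape a phone number from a website.

When I inspect the phone number in the second entry, the Chrome inspector gives me the following result: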

    <span class="nummer">(012) 34 56 78</span>
    <span class="suffix encode_me telSelector129112728843_1306868" data-telselector="telSelector129112728843_1306868" data-telsuffix="IDEw"> 90</span>

However, HtmlUnit (and Chrome, if I click "show source") shows the following, with the suffix span empty:

    <span class="nummer">(012) 34 56 78</span>
    <span class="suffix encode_me telSelector129112728843_1306868" data-telselector="telSelector129112728843_1306868" data-telsuffix="IDEw"></span>

Is there a way to get the last part of the phone number with HtmlUnit?

1 answer:

Answer 0: (score: 0)

With the latest version, I got it to work:

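    // Package names as in HtmlUnit 2.x.
    import com.gargoylesoftware.htmlunit.BrowserVersion;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlElement;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    // JavaScript is enabled by default, so the page's scripts can fill in the suffix span.
    try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
        String url = "http://www.gelbeseiten.de/schneider/hamburg";
        HtmlPage htmlPage = webClient.getPage(url);
        for (Object o : htmlPage.getByXPath("//span[@class='teilnehmertelefon']")) {
            System.out.println(((HtmlElement) o).asXml());
        }
    }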

This prints the entries, for example:

    <span class="teilnehmertelefon">
      <span class="text nummer_ganz">
        <span class="nummer">
          (040) 78 80 89
        </span>
        <span class="suffix encode_me telSelector129112728843_3662885" data-telselector="telSelector129112728843_3662885" data-telsuffix="IDEw">
           10
        </span>
      </span>
    </span>
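
The output above also hints at an approach that does not depend on JavaScript running: the `data-telsuffix` value `IDEw` is the Base64 encoding of ` 10` (a space followed by `10`), which matches the text that ends up in the suffix span. That the site always encodes the suffix this way is an assumption drawn from this single entry, not something the page documents. A minimal sketch using only the JDK follows; `TelSuffixDecoder`, `decodeSuffix`, and `fullNumber` are hypothetical names made up for illustration and are not part of HtmlUnit.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of HtmlUnit.
final class TelSuffixDecoder {

    // Decodes a data-telsuffix value, assuming it is plain Base64 text.
    static String decodeSuffix(String telSuffix) {
        byte[] raw = Base64.getDecoder().decode(telSuffix);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Joins the visible part of the number with the decoded suffix.
    // The decoded suffix in the sample already starts with a space.
    static String fullNumber(String visiblePart, String telSuffix) {
        return visiblePart.trim() + decodeSuffix(telSuffix);
    }

    public static void main(String[] args) {
        // Values taken from the printed entry above.
        System.out.println(fullNumber("(040) 78 80 89", "IDEw"));
        // prints "(040) 78 80 89 10"
    }
}
```

With HtmlUnit, the attribute value could be read from the suffix span with `getAttribute("data-telsuffix")` and passed to `decodeSuffix`, which would also work if the script had not run yet.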