I am trying to scrape a phone number from a website.
When I inspect the phone number in the second entry, Chrome's inspector gives me the following:
<span class="nummer">(012) 34 56 78</span>
<span class="suffix encode_me telSelector129112728843_1306868" data-telselector="telSelector129112728843_1306868" data-telsuffix="IDEw"> 90</span>
However, HtmlUnit (and Chrome, if I click "show source") shows the following, with the suffix span empty:
<span class="nummer">(012) 34 56 78</span>
<span class="suffix encode_me telSelector129112728843_1306868" data-telselector="telSelector129112728843_1306868" data-telsuffix="IDEw"></span>
Is there a way to get the last part of the phone number with HtmlUnit?
Answer 0 (score: 0)
With the latest version of HtmlUnit, I get the suffix:
// Emulate Chrome; HtmlUnit runs the page's JavaScript by default.
try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
    String url = "http://www.gelbeseiten.de/schneider/hamburg";
    HtmlPage htmlPage = webClient.getPage(url);
    // Print every phone-number span as it appears in the loaded DOM.
    for (Object o : htmlPage.getByXPath("//span[@class='teilnehmertelefon']")) {
        System.out.println(((HtmlElement) o).asXml());
    }
}
This prints entries like:
<span class="teilnehmertelefon">
  <span class="text nummer_ganz">
    <span class="nummer">
      (040) 78 80 89
    </span>
    <span class="suffix encode_me telSelector129112728843_3662885" data-telselector="telSelector129112728843_3662885" data-telsuffix="IDEw">
      10
    </span>
  </span>
</span>
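As a side note, the data-telsuffix value in the output above looks like Base64: "IDEw" decodes to " 10", which matches the suffix printed there. Assuming the site really encodes the suffix this way (an inference from this one sample, not something the site documents), the attribute could be decoded directly if the JavaScript does not run. The class and method names below are made up for illustration; with HtmlUnit, the attribute value would come from something like element.getAttribute("data-telsuffix").

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: decodes the data-telsuffix attribute,
// assuming it holds the Base64-encoded tail of the phone number.
final class TelSuffixDecoder {

    // Decode the raw attribute value, e.g. "IDEw" -> " 10".
    static String decode(String telSuffix) {
        byte[] bytes = Base64.getDecoder().decode(telSuffix);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Join the visible number and the decoded suffix with a single space.
    static String fullNumber(String number, String telSuffix) {
        return number.trim() + " " + decode(telSuffix).trim();
    }

    public static void main(String[] args) {
        // Values taken from the printed entry above.
        System.out.println(fullNumber("(040) 78 80 89", "IDEw"));
    }
}
```

If the site changes its encoding, this stops working, so letting HtmlUnit execute the page's script as shown in the answer is the more robust route.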