I am using HtmlUnit to fetch the content of a web page that relies on JavaScript:
// Guava Stopwatch; the public constructor is deprecated in favor of the factory method
Stopwatch timer = Stopwatch.createStarted();
final WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setActiveXNative(false);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setDoNotTrackEnabled(true);
webClient.getOptions().setGeolocationEnabled(false);
webClient.getOptions().setPopupBlockerEnabled(true);
webClient.getOptions().setPrintContentOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setUseInsecureSSL(true);
webClient.setCssErrorHandler(new SilentCssErrorHandler());
webClient.getCookieManager().setCookiesEnabled(false);
webClient.getOptions().setRedirectEnabled(false);
webClient.getOptions().setTimeout(900);
System.out.println("1.0 : " + timer.elapsed(TimeUnit.MILLISECONDS));
final HtmlPage page = webClient.getPage(url);
System.out.println(page.getWebResponse().getLoadTime());
System.out.println("1.1 : " + timer.elapsed(TimeUnit.MILLISECONDS));
This is the output I get:
1.0 : 2
707
1.1 : 8003
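Why does the WebClient take so much longer than the WebResponse load time (about 8 seconds for getPage to return, versus a reported load time of 707 ms)? How can I reduce that time?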