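JavaScript is not executed on a website when using HtmlUnit and PhantomJS

Asked: 2016-04-01 20:32:51

Tags: java selenium phantomjs htmlunit

I am trying to get the HTML source of a web page, first with HtmlUnit and then with PhantomJS, but both fail me. The page source I get back contains JavaScript that does not appear to have been executed. I don't really understand what is going on. The HtmlUnit version I tried: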

webClient = new WebClient(BrowserVersion.FIREFOX_38);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.waitForBackgroundJavaScript(10000);
webClient.getOptions().setThrowExceptionOnScriptError(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);

HtmlPage page = webClient.getPage("https://www.flickr.com/search/?text=cats&view_all=1");
webClient.close();

System.out.println(page.asXml());

The PhantomJS version:

File phantomjs = Phanbedder.unpack();
DesiredCapabilities dcaps = new DesiredCapabilities();
dcaps.setJavascriptEnabled(true);
dcaps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, phantomjs.getAbsolutePath());
dcaps.setCapability("phantomjs.page.settings.userAgent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36");

driver = new PhantomJSDriver(dcaps);
driver.manage().timeouts().setScriptTimeout(10, TimeUnit.SECONDS);
driver.get("https://www.flickr.com/search/?text=cats&view_all=1");
System.out.println(driver.getPageSource());

If anyone can help me, I would be very grateful. Thanks.

1 answer:

Answer 0 (score: 0)

I wouldn't overthink this. In Firefox, install the (Web) Developer toolbar. Click View Source -> Generated Source.
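
If the generated source has to be captured from code rather than from the browser, one thing worth checking is when the source is read. In the HtmlUnit snippet, `waitForBackgroundJavaScript` is called before `getPage`, so there are no scripts to wait for yet. In the PhantomJS snippet, `setScriptTimeout` only limits `executeAsyncScript` calls and does not delay `getPageSource()`. Below is a minimal sketch of a polling helper that re-reads the source until some expected markup appears. It is not a confirmed fix for Flickr. The class name `RenderedSourceWaiter`, the method `waitForMarker`, and the marker `class="photo"` are all made up for illustration, and the fake page in `main` only stands in for a real browser.

```java
import java.util.function.Supplier;

public class RenderedSourceWaiter {

    /**
     * Re-reads the page source until it contains the marker or the timeout
     * elapses. Returns the last source that was read, whether or not the
     * marker was found, so the caller can inspect what was actually there.
     */
    public static String waitForMarker(Supplier<String> sourceSupplier, String marker,
                                       long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        String source = sourceSupplier.get();
        while ((source == null || !source.contains(marker)) && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                break;
            }
            source = sourceSupplier.get();
        }
        return source;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real page: the rendered markup
        // only shows up on the third read.
        int[] reads = {0};
        Supplier<String> fakePage = () -> ++reads[0] < 3
                ? "<html><script>/* not run yet */</script></html>"
                : "<html><div class=\"photo\"></div></html>";

        String html = waitForMarker(fakePage, "class=\"photo\"", 2000, 50);
        System.out.println(html.contains("class=\"photo\""));
    }
}
```

With Selenium the supplier could be `driver::getPageSource`. With HtmlUnit it could be `page::asXml`, called before `webClient.close()`. The marker has to be something that only exists once the page's scripts have run, which you would need to find by looking at the generated source in the browser first.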