我正在使用htmlunit来解析网站。以下代码失败:
WebClient wc = new WebClient(BrowserVersion.CHROME);
wc.getOptions().setCssEnabled(true);
wc.getOptions().setJavaScriptEnabled(true);
wc.getOptions().setThrowExceptionOnScriptError(true);
wc.waitForBackgroundJavaScript(10000);
wc.setJavaScriptTimeout(10000);
// wc.setAjaxController(new NicelyResynchronizingAjaxController());
wc.getOptions().setUseInsecureSSL(true);
final HtmlPage currentPage = wc.getPage("<website_url>");
如果我禁用了javascript wc.getOptions().setJavaScriptEnabled(true)
错误:
======= EXCEPTION START ========
Exception class=[java.lang.RuntimeException]
com.gargoylesoftware.htmlunit.ScriptException: Exception invoking constructor
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:894)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:637)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:518)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:774)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:750)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:102)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:991)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:366)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:247)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:268)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:800)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:756)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1236)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1136)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:226)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:345)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3178)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2141)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:945)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:521)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:472)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:999)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:250)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:192)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:272)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:160)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:522)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:396)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:313)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:461)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:446)
Caused by: java.lang.NegativeArraySizeException
at com.gargoylesoftware.htmlunit.javascript.host.arrays.ArrayBuffer.constructor(ArrayBuffer.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:153)
... 46 more
答案 0 :(得分:0)
这是一个HtmlUnit错误。如果错误仍然存在,请使用最新的快照并打开问题。
在此期间,我找到了一些时间来查看HtmlUnit源代码。该错误现已修复,您可以从我们的ci服务器下载新版本(在构建完成后)。 将在本周推出新的SNAPSHOT版本。
答案 1 :(得分:0)
还有一个刷新的SNAPSHOT版本。