HtmlUnit infinite loop and StackOverflowError

Time: 2013-01-05 02:57:22

Tags: java javascript web html-parsing htmlunit

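I'm trying to fetch this page with HtmlUnit, but HtmlUnit seems to go into an infinite loop and crash.

I've been trying to find the cause, but I've given up. I have tried:

  • Getting the latest HtmlUnit code from SVN
  • Using different browser versions
  • Debugging, but I couldn't find the cause.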

If I use     webClient.setJavaScriptEnabled(false);

I can access the site, but important scripts are not executed. So the problem is in the JavaScript.

Here is sample code that reproduces the loop:

    String url = "http://www.tjpe.jus.br/processos/consulta1grau/oleConsultaProcesso.asp?nume=00123335620128170990&modalidade=6";

    final List<String> collectedAlerts = new ArrayList<String>();
    final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17);
    webClient.setThrowExceptionOnScriptError(false);
    webClient.setAlertHandler(new CollectingAlertHandler(collectedAlerts));
    webClient.waitForBackgroundJavaScript(10000);
    webClient.waitForBackgroundJavaScriptStartingBefore(10000);
    try {
        final HtmlPage page1 = webClient
        .getPage(url);

        page1.asXml();
    } catch (FailingHttpStatusCodeException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (MalformedURLException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

This is the error:

Jan 4, 2013 11:39:05 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 4, 2013 11:39:05 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Jan 4, 2013 11:39:43 PM com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine handleJavaScriptException
INFO: Caught script exception
======= EXCEPTION START ========
Exception class=[java.lang.RuntimeException]
com.gargoylesoftware.htmlunit.ScriptException: Exception invoking go
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:665)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:587)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:534)

...

Caused by: java.lang.StackOverflowError
at java.lang.reflect.Method.copy(Unknown Source)
at java.lang.reflect.ReflectAccess.copyMethod(Unknown Source)

....

at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:432)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
======= EXCEPTION END ========
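
The Caused by: java.lang.StackOverflowError line suggests unbounded recursion (a call chain that keeps re-entering itself) rather than a literal loop. One way to narrow it down is to look for the frame that repeats most often in the elided part of the trace. The sketch below is plain Java with no HtmlUnit dependency. The class RecursionTraceDemo, the method go (named after the "Exception invoking go" message, but only a stand-in) and the helper mostFrequentFrame are hypothetical names used for illustration. None of them are part of HtmlUnit.

```java
import java.util.HashMap;
import java.util.Map;

public class RecursionTraceDemo {

    // Hypothetical stand-in for a callback that keeps re-invoking itself.
    static void go(int depth) {
        go(depth + 1); // no base case, so the stack eventually overflows
    }

    // Returns the "Class.method" frame that occurs most often in the trace.
    static String mostFrequentFrame(Throwable t) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (StackTraceElement frame : t.getStackTrace()) {
            String key = frame.getClassName() + "." + frame.getMethodName();
            int count = counts.merge(key, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = key;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        try {
            go(0);
        } catch (StackOverflowError e) {
            // StackOverflowError is an Error, so catch (Exception e) would not catch it.
            System.out.println("Recursing frame: " + mostFrequentFrame(e));
        }
    }
}
```

Applied to the real failure, the same counting over the full stack trace should point at the frames that keep calling each other. Note that the catch blocks in the sample code above only handle checked exceptions, so they would not intercept a StackOverflowError on their own.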

The URL I'm using is public, so it can be accessed freely.

I'd be grateful if anyone could help.

1 answer:

Answer 0 (score: 0)

The URL you are trying to access seems to have some JavaScript errors. Try

webClient.setThrowExceptionOnScriptError(false);