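HtmlUnit behind a proxy

Date: 2016-10-07 11:03:19

Tags: proxy htmlunit

I use HtmlUnit to fetch some information from a website. It works when I am not behind a proxy. When working behind a proxy, I use this code to set up the configuration: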

    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);
    ProxyConfig proxyConfig = new ProxyConfig();
    proxyConfig.setProxyAutoConfigUrl(proxyAutoConfigUrl);
    webClient.getOptions().setProxyConfig(proxyConfig);
    HtmlPage page = webClient.getPage(config.read().getString("homepage.url"));
    FrameWindow fw = page.getFrameByName("mainDataIframe");
    ...

But I always get this error:

    ERROR AWT-EventQueue-0 com.gargoylesoftware.htmlunit.html.HtmlPage - Error loading JavaScript from [some website].
java.io.IOException: Unable to download JavaScript from 'some website' (status 404).
    at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1040)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:967)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:352)
    at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:238)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:257)
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:772)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:729)
    at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1209)
    at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1111)
    at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:207)
    at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:337)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3137)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2100)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:927)
    at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:506)
    at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:459)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:979)
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:241)
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:187)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:269)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:157)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:511)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:385)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:303)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:450)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:435)
    at com.bayer.elwasweb.PageReader.messstellenDurchlaufen(PageReader.java:99)
    at com.bayer.elwasweb.PageReader.start(PageReader.java:70)
    at com.bayer.elwasweb.Gui.actionPerformed(Gui.java:211)
    at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
    at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
    at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
    at java.awt.Component.processMouseEvent(Unknown Source)
    at javax.swing.JComponent.processMouseEvent(Unknown Source)
    at java.awt.Component.processEvent(Unknown Source)
    at java.awt.Container.processEvent(Unknown Source)
    at java.awt.Component.dispatchEventImpl(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Window.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
    at java.awt.EventQueue.access$500(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
    at java.awt.EventQueue$4.run(Unknown Source)
    at java.awt.EventQueue$4.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)
ERROR AWT-EventQueue-0 PageReader - com.gargoylesoftware.htmlunit.ElementNotFoundException: elementName=[frame or iframe] attributeName=[name] attributeValue=[mainDataIframe]

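This error never occurs without the proxy, or in a browser. Does anyone have an idea why this error happens? Or does anyone know how to find out what is different when working behind the proxy?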

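1 answer:

Answer 0 (score: 1)

We figured it out! The exception/error is misleading.

The error message led us to conclude that the problem was the JavaScript file named in the exception failing to load. In fact, the entire page had not loaded.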
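Since the stack trace points at a script rather than at the page itself, it can help to look at what the top-level request actually returned. The following is only a minimal sketch, assuming HtmlUnit 2.x (the `com.gargoylesoftware.htmlunit` package from the stack trace); the class `ResponseDump` and the helper `summarize` are hypothetical names introduced for illustration, not part of HtmlUnit.

```java
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebResponse;

public class ResponseDump {

    /** Hypothetical helper: status line, content type and the start of the body. */
    static String summarize(Page page) {
        WebResponse response = page.getWebResponse();
        String body = response.getContentAsString();
        if (body == null) {
            body = "";
        }
        // Only the first 500 characters, enough to recognise an error page.
        String head = body.length() > 500 ? body.substring(0, 500) : body;
        return response.getStatusCode() + " " + response.getStatusMessage()
                + " (" + response.getContentType() + ")\n" + head;
    }

    public static void main(String[] args) throws Exception {
        try (WebClient webClient = new WebClient()) {
            // Return the page even for 4xx/5xx responses instead of throwing.
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            // args[0] is the URL to inspect (an assumption of this sketch).
            Page page = webClient.getPage(args[0]);
            System.out.println(summarize(page));
        }
    }
}
```

Calling `summarize` on a page fetched with and without the proxy configuration should make it visible whether the expected document or something else, for example an error page, was returned.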

The proxy simply needed credentials. We added

    final DefaultCredentialsProvider credentialsProvider = 
       (DefaultCredentialsProvider) webClient.getCredentialsProvider();

    credentialsProvider.addCredentials(username, password);

to the WebClient configuration.
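Put together with the proxy setup from the question, the configuration might look roughly like the sketch below. It assumes the same HtmlUnit 2.x version as the question (one that still provides `BrowserVersion.FIREFOX_45`); the class `ProxiedClientFactory` and the method `newProxiedClient` are hypothetical names used only for this illustration.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.DefaultCredentialsProvider;
import com.gargoylesoftware.htmlunit.ProxyConfig;
import com.gargoylesoftware.htmlunit.WebClient;

public class ProxiedClientFactory {

    /** Hypothetical helper: WebClient with a PAC-based proxy and credentials. */
    static WebClient newProxiedClient(String proxyAutoConfigUrl,
                                      String username, String password) {
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);

        // Proxy selection via the proxy auto-config (PAC) URL, as in the question.
        ProxyConfig proxyConfig = new ProxyConfig();
        proxyConfig.setProxyAutoConfigUrl(proxyAutoConfigUrl);
        webClient.getOptions().setProxyConfig(proxyConfig);

        // Credentials as in the answer; this two-argument overload is not
        // restricted to a particular host, port or realm.
        DefaultCredentialsProvider credentialsProvider =
                (DefaultCredentialsProvider) webClient.getCredentialsProvider();
        credentialsProvider.addCredentials(username, password);

        return webClient;
    }
}
```

The returned client can then be used with `getPage(...)` exactly as in the question. If the credentials should only be offered to the proxy, `DefaultCredentialsProvider` also has an `addCredentials` overload that takes a host, port and realm.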