Selenium HtmlUnit: ignore JS errors in Python

Date: 2018-01-14 20:07:43

Tags: python selenium htmlunit-driver

I am using Selenium with JavaScript-enabled HtmlUnit to read websites from Python. Unfortunately, I run into problems with sites whose JavaScript is not the cleanest. For example:

from selenium import webdriver

try:
    browser = webdriver.Remote(desired_capabilities=webdriver.DesiredCapabilities.HTMLUNITWITHJS)
    browser.get('https://www.ebay.com/')
    browser.close()
    print('success')
except Exception as e:
    print(e)

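This results in an error, as if Python were being handed the JavaScript error through the webdriver. Note that this does not happen with the Chrome, Firefox, or IE webdrivers.

The exception e: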

TypeError: Cannot read property "classList" from undefined (script in https://www.ebay.com/ from (46, 26) to (73, 78)#70)
Stacktrace:
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError (ScriptRuntime.java:4130)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError (ScriptRuntime.java:4108)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError (ScriptRuntime.java:4141)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError2 (ScriptRuntime.java:4160)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.undefReadError (ScriptRuntime.java:4173)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getObjectProp (ScriptRuntime.java:1528)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop (Interpreter.java:1245)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret (Interpreter.java:815)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call (InterpretedFunction.java:111)
at net.sourceforge.htmlunit.corejs.javascript.NativeArray.iterativeMethod (NativeArray.java:1671)
at net.sourceforge.htmlunit.corejs.javascript.NativeArray.execIdCall (NativeArray.java:353)
at net.sourceforge.htmlunit.corejs.javascript.IdFunctionObject.call (IdFunctionObject.java:101)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop (Interpreter.java:1484)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret (Interpreter.java:815)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call (InterpretedFunction.java:111)
at net.sourceforge.htmlunit.corejs.javascript.NativeArray.iterativeMethod (NativeArray.java:1671)
at net.sourceforge.htmlunit.corejs.javascript.NativeArray.execIdCall (NativeArray.java:353)
at net.sourceforge.htmlunit.corejs.javascript.IdFunctionObject.call (IdFunctionObject.java:101)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop (Interpreter.java:1484)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret (Interpreter.java:815)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call (InterpretedFunction.java:111)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall (ContextFactory.java:417)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall (HtmlUnitContextFactory.java:325)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall (ScriptRuntime.java:3424)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec (InterpretedFunction.java:122)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun (JavaScriptEngine.java:781)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run (JavaScriptEngine.java:895)
at net.sourceforge.htmlunit.corejs.javascript.Context.call (Context.java:599)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call (ContextFactory.java:527)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute (JavaScriptEngine.java:790)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute (JavaScriptEngine.java:766)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute (JavaScriptEngine.java:757)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScript (HtmlPage.java:920)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded (HtmlScript.java:316)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded (HtmlScript.java:396)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute (HtmlScript.java:246)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage (HtmlScript.java:267)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement (HTMLParser.java:805)
at org.apache.xerces.parsers.AbstractSAXParser.endElement (None:-1)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement (HTMLParser.java:761)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement (HTMLTagBalancer.java:1236)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement (HTMLTagBalancer.java:1136)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement (DefaultFilter.java:226)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement (NamespaceBinder.java:345)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement (HTMLScanner.java:3178)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan (HTMLScanner.java:2141)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument (HTMLScanner.java:945)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse (HTMLConfiguration.java:521)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse (HTMLConfiguration.java:472)
at org.apache.xerces.parsers.XMLParser.parse (None:-1)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse (HTMLParser.java:1004)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse (HTMLParser.java:253)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml (HTMLParser.java:195)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage (DefaultPageCreator.java:267)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage (DefaultPageCreator.java:158)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto (WebClient.java:524)
at com.gargoylesoftware.htmlunit.WebClient.getPage (WebClient.java:398)
at com.gargoylesoftware.htmlunit.WebClient.getPage (WebClient.java:315)
at org.openqa.selenium.htmlunit.HtmlUnitDriver.get (HtmlUnitDriver.java:670)
at org.openqa.selenium.htmlunit.HtmlUnitDriver.lambda$get$8 (HtmlUnitDriver.java:657)
at org.openqa.selenium.htmlunit.HtmlUnitDriver.lambda$runAsync$0 (HtmlUnitDriver.java:414)
at java.lang.Thread.run (None:-1)

I found the following, which works for Java:

WebClient client = new WebClient();
client.getOptions().setThrowExceptionOnScriptError(false);

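I can't figure out how to achieve this in Python. Any suggestions?

1 answer:

Answer 0 (score: 1)

It looks like implementing a custom error handler solves the problem, for example: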

from selenium import webdriver
from selenium.webdriver.remote.errorhandler import ErrorHandler

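class MyHandler(ErrorHandler):
    def check_response(self, response):
        # Swallow every error reported by the remote end, including
        # HtmlUnit script errors, instead of raising it to the caller.
        try:
            super(MyHandler, self).check_response(response)
        except Exception:
            pass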

try:
    browser = webdriver.Remote(desired_capabilities=webdriver.DesiredCapabilities.HTMLUNITWITHJS)
    browser.error_handler = MyHandler()
    browser.get('https://www.ebay.com/')
    browser.close()
    print('success')
except Exception as e:
    print(e)
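
Note that the handler above suppresses every error the remote end reports, not only JavaScript ones. A narrower variation is sketched below. It rests on an assumption: that HtmlUnit script failures can be recognized by the Rhino/HtmlUnit package names in the exception text, as they appear in the stack trace in the question. The names SCRIPT_ERROR_MARKERS, is_script_error and ignore_script_errors are hypothetical helpers, not part of Selenium.

```python
import contextlib

# Assumption: script failures mention these packages somewhere in str(exc),
# as in the stack trace shown in the question. Adjust for your versions.
SCRIPT_ERROR_MARKERS = (
    "net.sourceforge.htmlunit.corejs.javascript",
    "com.gargoylesoftware.htmlunit.javascript",
)


def is_script_error(exc):
    """Return True if the exception text looks like an HtmlUnit JS error."""
    text = str(exc)
    return any(marker in text for marker in SCRIPT_ERROR_MARKERS)


@contextlib.contextmanager
def ignore_script_errors():
    """Suppress exceptions that look like JS errors; re-raise all others."""
    try:
        yield
    except Exception as exc:
        if not is_script_error(exc):
            raise
```

It would be used as "with ignore_script_errors(): browser.get('https://www.ebay.com/')", leaving the default error handler in place so that unrelated failures, such as a missing element, still raise. Whether the page is usable after a suppressed script error depends on HtmlUnit, so check the result before relying on it.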