Error during WebSocket handshake: Unexpected response code: 200 when using ChromeDriver and Selenium

Time: 2018-09-17 12:50:47

Tags: python selenium selenium-webdriver webdriver selenium-chromedriver

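I am trying to extract data from the following website: https://www.bigschedules.com/. When I do this manually, the site works fine.

I wrote a script in Python using Selenium and ChromeDriver. It used to work, but now it shows the error "Error during WebSocket handshake: Unexpected response code: 200".

The script opens Chrome and tries to fetch data from the website, but it gets stuck, as shown in the image below: [click here to view the image] [1]

[1]: https://i.stack.imgur.com/0JxEi.png

I am using ChromeDriver version 2.42 and Selenium version 3.14.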

def setupChrome(self):

    # Contains all chrome settings
    self.logger.info("Setting-up Chrome")
    self.settings = webdriver.ChromeOptions()
    #self.settings.add_argument("--incognito")
    self.settings.add_argument('--ignore-ssl-errors')
    self.settings.add_argument('--ignore-certificate-errors')
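    # These two flags originally began with an en dash ("–-"), so Chrome would not have recognised them
    self.settings.add_argument('--disable-web-security')
    self.settings.add_argument('--allow-running-insecure-content')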

def loadBrowser(self):
    self.setupChrome()

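    try:
        self.browser = webdriver.Chrome(chrome_options=self.settings,
                                        executable_path="D:\\chromedriver.exe")
        self.browser.maximize_window()
    except Exception:
        # The original snippet is cut off after the try body; this minimal handler is an assumption
        self.logger.exception("Failed to start Chrome")
        raise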

I get the following errors in the console stack:

webtrends.js:1 **A parser-blocking**, cross site (i.e. different eTLD+1) script, https://sdc.oocl.com/dcsg6upoljf1zldtivsnov48s_8o7d/wtid.js, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.
WebTrends.dcsGetId @ webtrends.js:1
(anonymous) @ VM29:431

6[Intervention] **Slow network is detected**. See <URL> for more details. Fallback font will be used while loading: <URL>
application-c962374717.min.js:4 

pascalprecht.translate.$translateSanitization: **No sanitization** strategy has been configured. This can have serious security implications. See http://angular-translate.github.io/docs/#/guide/19_security for details.
(anonymous) @ application-c962374717.min.js:4
warn @ application-c962374717.min.js:12
c @ angular-translate.min.js:6
sanitize @ angular-translate.min.js:6
a.interpolate @ angular-translate.min.js:6
q.instant @ angular-translate.min.js:6
n @ angular-translate.min.js:6
fn @ VM201:4
e @ angular.js:16658
P.exp @ angular.js:13144
pre @ angular.js:10436
(anonymous) @ angular.js:1385
wa @ angular.js:10545
q @ angular.js:9911
f @ angular.js:9174
q @ angular.js:9928
f @ angular.js:9174
q @ angular.js:9928
(anonymous) @ angular.js:10273
(anonymous) @ angular.js:17051
$digest @ angular.js:18233
$apply @ angular.js:18531
l @ angular.js:12547
s @ angular.js:12785
y.onload @ angular.js:12702
application-c962374717.min.js:4 

Deprecation warning: **moment().add(period, number) is deprecated. Please use moment().add(number, period). See http://momentjs.com/guides/#/warnings/add-inverted-param/ for more info.**
(anonymous) @ application-c962374717.min.js:4
k @ moment-with-locales.min.js:1
T @ moment-with-locales.min.js:1
(anonymous) @ moment-with-locales.min.js:1
(anonymous) @ application-c962374717.min.js:44
invoke @ angular.js:5040
P.instance @ angular.js:11000
q @ angular.js:9865
f @ angular.js:9174
f @ angular.js:9177
f @ angular.js:9177
f @ angular.js:9177
(anonymous) @ angular.js:9039
(anonymous) @ angular.js:9430
d @ angular.js:9217
m @ angular.js:9984
(anonymous) @ angular.js:32398
(anonymous) @ angular.js:1385
(anonymous) @ angular.js:10539
wa @ angular.js:10545
q @ angular.js:9934
(anonymous) @ angular.js:10273
(anonymous) @ angular.js:17051
$digest @ angular.js:18233
$apply @ angular.js:18531
l @ angular.js:12547
s @ angular.js:12785
y.onload @ angular.js:12702
universalModuleDefinition:3 

WebSocket connection to 'wss://www.bigschedules.com/socket.io/?EIO=3&transport=websocket&sid=yywiluhT_bdXDglEAAkc' failed: **Error during WebSocket handshake: Unexpected response code: 200**


n.doOpen @ universalModuleDefinition:3
n.open @ universalModuleDefinition:2
n.probe @ universalModuleDefinition:2
n.onOpen @ universalModuleDefinition:2
n.onHandshake @ universalModuleDefinition:2
n.onPacket @ universalModuleDefinition:2
(anonymous) @ universalModuleDefinition:2
n.emit @ universalModuleDefinition:2
n.onPacket @ universalModuleDefinition:2
r @ universalModuleDefinition:2
(anonymous) @ universalModuleDefinition:2
e.decodePayloadAsBinary @ universalModuleDefinition:2
e.decodePayload @ universalModuleDefinition:2
n.onData @ universalModuleDefinition:2
(anonymous) @ universalModuleDefinition:2
n.emit @ universalModuleDefinition:2
i.onData @ universalModuleDefinition:2
i.onLoad @ universalModuleDefinition:2
hasXDR.r.onreadystatechange @ universalModuleDefinition:2
application-c962374717.min.js:23 Uncaught TypeError: **Cannot assign to read only property 'tagName' of object '#<HTMLDivElement>'**
    at Object.handler.tagNameHandler (application-c962374717.min.js:23)
    at Object.handler.constructInfo (application-c962374717.min.js:23)
    at application-c962374717.min.js:23
handler.tagNameHandler @ application-c962374717.min.js:23
handler.constructInfo @ application-c962374717.min.js:23
(anonymous) @ application-c962374717.min.js:23
4application-c962374717.min.js:23

Uncaught TypeError: **Cannot assign to read only property** 'tagName' of object '#<HTMLInputElement>'
    at Object.handler.tagNameHandler (application-c962374717.min.js:23)
    at Object.handler.constructInfo (application-c962374717.min.js:23)
    at application-c962374717.min.js:23
handler.tagNameHandler @ application-c962374717.min.js:23
handler.constructInfo @ application-c962374717.min.js:23
(anonymous) @ application-c962374717.min.js:23
application-c962374717.min.js:23

Uncaught TypeError: **Cannot assign to read only property** 'tagName' of object '[object HTMLAnchorElement]'
    at Object.handler.tagNameHandler (application-c962374717.min.js:23)
    at Object.handler.constructInfo (application-c962374717.min.js:23)
    at application-c962374717.min.js:23
handler.tagNameHandler @ application-c962374717.min.js:23
handler.constructInfo @ application-c962374717.min.js:23
(anonymous) @ application-c962374717.min.js:23
query:1 **Failed to load resource**: the server responded with a status of 401 (Unauthorized)
application-c962374717.min.js:23 

Uncaught TypeError: **Cannot assign to read only property** 'tagName' of object '[object HTMLAnchorElement]'
    at Object.handler.tagNameHandler (application-c962374717.min.js:23)
    at Object.handler.constructInfo (application-c962374717.min.js:23)
    at tracking (application-c962374717.min.js:23)
    at firstThingAfterSearch (application-c962374717.min.js:23)
    at monitor (application-c962374717.min.js:23)
    at application-c962374717.min.js:23
handler.tagNameHandler @ application-c962374717.min.js:23
handler.constructInfo @ application-c962374717.min.js:23
tracking @ application-c962374717.min.js:23
firstThingAfterSearch @ application-c962374717.min.js:23
monitor @ application-c962374717.min.js:23
(anonymous) @ application-c962374717.min.js:23
setTimeout (async)
(anonymous) @ application-c962374717.min.js:23
wrappedFn @ application-c962374717.min.js:23
angular.js:12759 GET https://www.bigschedules.com/api/routeSearch/query?_=1537193893310&carrier=COSU&carrier=APLU&carrier=MSCU&departureFrom=2018-09-17T00:00:00.000Z&departureTo=2018-09-30T23:59:59.999Z&fndID=P1015&isOriginal=true&porID=P94&requestRefNo=432d9035-b7bb-40d9-b03f-208ffcbdafa3&socketID=yywiluhT_bdXDglEAAkc **401 (Unauthorized)**
(anonymous) @ angular.js:12759
q @ angular.js:12492
(anonymous) @ angular.js:12244
(anonymous) @ angular.js:17051
$digest @ angular.js:18233
(anonymous) @ angular.js:18462
e @ angular.js:6362
(anonymous) @ angular.js:6642
setTimeout (async)
h.defer @ angular.js:6640
$evalAsync @ angular.js:18460
(anonymous) @ angular.js:16923
k @ angular.js:17095
l @ angular.js:17122
c @ angular.js:17131
r @ bluebird.min.js:31
i._settlePromiseFromHandler @ bluebird.min.js:30
i._settlePromise @ bluebird.min.js:30
i._settlePromise0 @ bluebird.min.js:30
i._settlePromises @ bluebird.min.js:30
r._drainQueue @ bluebird.min.js:29
r._drainQueues @ bluebird.min.js:29
drainQueues @ bluebird.min.js:29
Promise.then (async)
r @ bluebird.min.js:30
r._queueTick @ bluebird.min.js:29
s @ bluebird.min.js:29
p.hasDevTools.r.settlePromises @ bluebird.min.js:29
i._fulfill @ bluebird.min.js:30
i._resolveCallback @ bluebird.min.js:30
(anonymous) @ bluebird.min.js:30
Do @ recaptcha__en.js:251
(anonymous) @ recaptcha__en.js:249
T4 @ recaptcha__en.js:71
ta @ recaptcha__en.js:71
Y @ recaptcha__en.js:68
application-c962374717.min.js:23 

Uncaught TypeError: **Cannot assign to read only property** 'tagName' of object '[object HTMLAnchorElement]'
    at Object.handler.tagNameHandler (application-c962374717.min.js:23)
    at Object.handler.constructInfo (application-c962374717.min.js:23)
    at tracking (application-c962374717.min.js:23)
    at firstThingAfterSearch (application-c962374717.min.js:23)
    at monitor (application-c962374717.min.js:23)
    at application-c962374717.min.js:23
handler.tagNameHandler @ application-c962374717.min.js:23
handler.constructInfo @ application-c962374717.min.js:23
tracking @ application-c962374717.min.js:23
firstThingAfterSearch @ application-c962374717.min.js:23
monitor @ application-c962374717.min.js:23
(anonymous) @ application-c962374717.min.js:23
setTimeout (async)
(anonymous) @ application-c962374717.min.js:23
wrappedFn @ application-c962374717.min.js:23
angular.js:12759 

GET https://www.bigschedules.com/api/routeSearch/query?_=1537193947261&carrier=COSU&carrier=APLU&carrier=MSCU&departureFrom=2018-09-17T00:00:00.000Z&departureTo=2018-09-30T23:59:59.999Z&fndID=P156&isOriginal=true&porID=P94&requestRefNo=ba8fbb09-d98a-4b44-96e0-040511775c80&socketID=yywiluhT_bdXDglEAAkc **401 (Unauthorized)**
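
Most of the dump above is warnings and stack frames. As a rough aid for reading it, here is a minimal sketch that keeps only the lines that look like real failures. The function name `summarize_console` and the three patterns are assumptions chosen to match the messages in this particular dump; they are not part of Selenium or Chrome.

```python
import re

# Assumed patterns, written to match the messages seen in the dump above.
PATTERNS = {
    "websocket": re.compile(r"WebSocket connection to '[^']+' failed: (.+)"),
    "http_401": re.compile(r"GET (\S+) 401 \(Unauthorized\)"),
    "type_error": re.compile(r"Uncaught TypeError: (.+)"),
}


def summarize_console(text):
    """Return {kind: [detail, ...]} for console lines that match a known failure."""
    found = {kind: [] for kind in PATTERNS}
    for line in text.splitlines():
        # The post wraps some phrases in "**" for emphasis; drop those markers first.
        line = line.replace("**", "")
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                found[kind].append(match.group(1).strip())
    return found
```

Run over the dump above, this should leave the WebSocket handshake failure, the two 401 responses from /api/routeSearch/query, and the repeated tagName TypeErrors, which are the entries worth investigating first.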

2 Answers:

Answer 0 (score: 0)

You can try urllib2 and BeautifulSoup in Python. The code sample below shows how to get the attributes of page elements from the page source.

# Python 2 example (urllib2 and BeautifulSoup 3)
from BeautifulSoup import BeautifulSoup
import urllib2

page = urllib2.urlopen('yourUrl')
soup = BeautifulSoup(page)
# Find every element with the tag you want, for instance "img"
elementsYouWantToExtract = soup.findAll('img')
for element in elementsYouWantToExtract:
    # Index the element itself (the original indexed an undefined name)
    print element['attributeYouWantToExtract']

Hope this helps...
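
For readers on Python 3, where urllib2 no longer exists, the same idea can be sketched with only the standard library. This is an illustration rather than a replacement for BeautifulSoup: the class `AttributeCollector` and the helper `extract_attribute` are hypothetical names, and the approach only sees the static HTML, so content that the page renders with JavaScript (the stack traces in the question show the site uses Angular) may not be present in it.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect one attribute from every occurrence of one tag."""

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lower-cases tag and attribute names.
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attribute:
                    self.values.append(value)


def extract_attribute(html, tag, attribute):
    """Return the given attribute of every <tag> element found in the HTML text."""
    collector = AttributeCollector(tag, attribute)
    collector.feed(html)
    collector.close()
    return collector.values
```

The HTML text can come from `urllib.request.urlopen(url).read().decode()`, for example, and `extract_attribute(html, "img", "src")` would then list the image sources that appear in the raw page.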

Answer 1 (score: 0)

From your code trials, it is not evident whether you have invoked the URL https://www.bigschedules.com/. However, as per your error stack trace, your main issue is:

WebSocket connection to 'wss://www.bigschedules.com/socket.io/?EIO=3&transport=websocket&sid=yywiluhT_bdXDglEAAkc' failed: Error during WebSocket handshake: Unexpected response code: 200

There can be a number of reasons behind this error, as follows:

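  • One possible reason is that the network request for this script may be blocked by the browser, in this or a future page load, due to poor network connectivity.
  • As per Intervention: Blocking the load of cross-origin, parser-blocking scripts inserted via document.write for users on 2G:

      For users on slow connections, such as 2G, the performance penalty from third-party scripts loaded via document.write is often so severe as to delay display of main page content for tens of seconds. This feature blocks the load of cross-origin, parser-blocking scripts inserted via document.write for users on a 2G connection if the HTTP cache is missed. The feature only applies to such scripts in the main frame.

  • Another reason may be that a slow network was detected while loading, so a fallback font was used, and that no sanitization strategy has been configured, which can have serious security implications. Hence you face: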

    response code: 200
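
For background on what the message itself means: under RFC 6455, a WebSocket opening handshake succeeds only if the server answers the upgrade request with status 101 (Switching Protocols) and a matching Sec-WebSocket-Accept header. A 200 means the request was answered as ordinary HTTP instead of being upgraded. The sketch below checks a raw handshake response the way a client roughly would; `check_handshake` and `expected_accept` are hypothetical helper names, and the check is deliberately simplified compared with a real browser.

```python
import base64
import hashlib

# GUID fixed by RFC 6455 for deriving Sec-WebSocket-Accept from the client key.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value the server must send back."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def check_handshake(raw_response, sec_websocket_key):
    """Return (ok, reason) for a raw HTTP response to a WebSocket upgrade request."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    parts = lines[0].split(" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        return False, "Malformed status line"
    status = int(parts[1])
    if status != 101:
        # This is the situation reported in the question.
        return False, "Unexpected response code: %d" % status
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    if headers.get("upgrade", "").lower() != "websocket":
        return False, "Missing or wrong Upgrade header"
    if "upgrade" not in headers.get("connection", "").lower():
        return False, "Missing or wrong Connection header"
    if headers.get("sec-websocket-accept") != expected_accept(sec_websocket_key):
        return False, "Sec-WebSocket-Accept mismatch"
    return True, "ok"
```

Passing a response that starts with "HTTP/1.1 200 OK" yields the same "Unexpected response code: 200" reason that Chrome prints, so the thing to investigate is why the server, or something in front of it, did not upgrade the connection.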
    

Solution

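  • Keep your test environment updated with the latest binaries of the Selenium client, the WebDriver variant, and the web browser.
  • Configure your test environment with a faster network (e.g. 3G or 4G).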
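
To check which binaries are actually in use before updating them, the capabilities dictionary that Selenium exposes on a running driver can be inspected. The sketch below is a small helper under stated assumptions: `describe_versions` is a hypothetical name, and it looks for the `browserVersion` or `version` key and the `chrome.chromedriverVersion` entry, which is where Chrome sessions report these values.

```python
def describe_versions(capabilities):
    """Pull the browser and driver version strings out of a capabilities dict."""
    # Newer (W3C) sessions report "browserVersion"; older ChromeDriver 2.x sessions report "version".
    browser = capabilities.get("browserVersion") or capabilities.get("version")
    chrome_info = capabilities.get("chrome") or {}
    # chromedriverVersion has the form "<version> (<build hash>)"; keep only the version part.
    driver = chrome_info.get("chromedriverVersion", "").split(" ")[0] or None
    return {"browser": browser, "driver": driver}
```

With the script from the question, this would be called as `describe_versions(self.browser.capabilities)` after the browser has started. The Selenium client version itself is available as `selenium.__version__`.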