PhantomJS no longer loads Google+ pages

Time: 2017-02-04 20:48:18

Tags: python selenium-webdriver phantomjs http-status-code-404 google-plus

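I am using PhantomJS 2.1.1 (on Ubuntu Server 16.04.1 and Mac OS X 10.12.2) with the Python Selenium WebDriver.

For the past few days PhantomJS has been unable to load Google+ pages; it gets a 404 error page instead. Loading the same page in Firefox through geckodriver returns the correct page, and so does pasting the URL into Safari, Firefox, or Chrome.

What is going wrong between Google+ and PhantomJS?

Sample code: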

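#!/usr/bin/env python
# Load a Google+ search page in PhantomJS and save the rendered HTML.

import io
import time

from selenium import webdriver

driver = webdriver.PhantomJS()

WORD = "rock"
driver.get("https://plus.google.com/s/%s/top" % WORD)
time.sleep(7)  # crude fixed wait for the page to finish rendering

# io.open with an explicit encoding writes unicode text on Python 2 and 3
with io.open('googleplus-test-search.html', 'w', encoding='utf-8') as f:
    f.write(driver.page_source)

driver.quit()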

Loaded page:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="initial-scale=1, minimum-scale=1, width=device-width">
<title>
Error 404 (Non trovato)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{color:#222;text-align:unset;margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px;}* >
 body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}pre{white-space:pre-wrap;}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}</style>
</head>
<body>
<div id="af-error-container">
<a href="//www.google.com/">
<span id="logo" aria-label="Google">
</span>
</a>
<p>
<b>
404.</b>
 <ins>
Errore.</ins>
</p>
<p>
Impossibile trovare l'URL richiesto su questo server. <ins>
Nessun'altra informazione disponibile.</ins>
</p>
</div>
</body>
</html>

1 answer:

Answer 0: (score: 0)

I fixed it by setting a custom userAgent for PhantomJS:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Copy the default PhantomJS capabilities and override the user agent
# so the request presents itself as desktop Safari.
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 9_1_3) AppleWebKit/602.3.12 "
    "(KHTML, like Gecko) Version/10.0.2 Safari/602.3.12"
)
browser = webdriver.PhantomJS(desired_capabilities=dcap)
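
To check whether the user agent change actually helped, one option is to look at the title of the page that came back. The following is only a minimal sketch: `is_google_404` is a hypothetical helper, not part of Selenium, and it assumes the error page title starts with "Error 404" as in the page shown in the question.

```python
import re


def is_google_404(page_source):
    """Return True if the HTML looks like the 404 page from the question.

    Assumption: the error page's <title> begins with "Error 404",
    as in the localized page quoted above.
    """
    match = re.search(r"<title>\s*(.*?)\s*</title>", page_source,
                      re.IGNORECASE | re.DOTALL)
    title = match.group(1) if match else ""
    return title.startswith("Error 404")


# Possible usage with the browser created above (not run here):
# browser.get("https://plus.google.com/s/rock/top")
# if is_google_404(browser.page_source):
#     print("Still getting the 404 page")
```

Checking the title is a heuristic; it only tells you that the returned page resembles the error page above, not why the server sent it.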