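I've seen many examples of this, and occasionally (say, 2 runs out of 100) it works, but most of the time it doesn't, and I don't understand why. Any ideas would be much appreciated!
I'm not familiar with using proxies. I suspect the data simply isn't being passed back through proxies that are valid and raise no error: the request just seems to return an empty page. I'm not sure how to test this further.
Specs: CentOS 7, selenium 3.6.0, phantomjs 2.1.1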
import os, requests
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
os.environ["PATH"] += os.pathsep + '/path/to/executable'
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
)
url = 'https://httpbin.org/ip'
proxy = 'xxx.xxx.xxx.xxx:xxxx'
# requests indicates that the proxy is valid 99% of the time
response = requests.get(url, proxies={"http": proxy, "https": proxy})
print(response.json())
service_args = [
    '--ignore-ssl-errors=true',
    '--proxy=' + proxy,
    '--proxy-type=http',
    '--ssl-protocol=any',
]
# 98% of the time this outputs u'<html><head></head><body></body></html>'
browser = webdriver.PhantomJS(desired_capabilities=dcap, service_args=service_args)
browser.get(url)
print(browser.page_source)
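
One possible way to test this further is sketched below. It is not a fix, only a diagnostic aid: it detects the empty-skeleton page programmatically and starts PhantomJS with its `--debug=true` flag and an explicit log file, so the log can show what happened to the request through the proxy. The helper names (`is_blank_page`, `build_service_args`, `make_browser`) and the log file name `phantomjs.log` are assumptions made for illustration and do not come from the original code.

```python
import re

# Matches the empty document PhantomJS reports when nothing was loaded,
# e.g. u'<html><head></head><body></body></html>'.
BLANK_PAGE = re.compile(
    r'^\s*<html[^>]*>\s*<head>\s*</head>\s*<body>\s*</body>\s*</html>\s*$',
    re.IGNORECASE,
)


def is_blank_page(page_source):
    """Return True if page_source is only the empty HTML skeleton."""
    return bool(BLANK_PAGE.match(page_source))


def build_service_args(proxy, debug=False):
    """Assemble the PhantomJS command-line flags used in the question.

    `proxy` is a 'host:port' string. With debug=True, PhantomJS is asked
    to print additional diagnostic messages.
    """
    args = [
        '--ignore-ssl-errors=true',
        '--proxy=' + proxy,
        '--proxy-type=http',
        '--ssl-protocol=any',
    ]
    if debug:
        args.append('--debug=true')
    return args


def make_browser(proxy, log_path='phantomjs.log'):
    """Start PhantomJS with debug output written to log_path (hypothetical name)."""
    # Imported here so the helpers above work without selenium installed.
    from selenium import webdriver
    return webdriver.PhantomJS(
        service_args=build_service_args(proxy, debug=True),
        service_log_path=log_path,
    )
```

With these in place, a run could call `browser = make_browser(proxy)`, then `browser.get(url)`, and check `is_blank_page(browser.page_source)`. When that returns True, the log file is the place to look for proxy or SSL errors that PhantomJS did not surface as exceptions.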