Selenium PhantomJS custom headers in Python

Time: 2016-02-27 05:25:15

Tags: python selenium phantomjs custom-headers

I want to add custom headers to Selenium PhantomJS in Python. These are the headers I want to add.

headers = { 'Accept':'*/*',
            'Accept-Encoding':'gzip, deflate, sdch',
            'Accept-Language':'en-US,en;q=0.8',
            'Cache-Control':'max-age=0',
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
          }

This is the code I am using:

from selenium import webdriver

service_args = [
    '--proxy=127.0.0.1:9999',
    '--proxy-type=socks5',
    ]
driver = webdriver.PhantomJS(service_args=service_args)


driver.set_window_size(1120, 550)
driver.get("https://duckduckgo.com/")
driver.find_element_by_id('search_form_input_homepage').send_keys("realpython")
driver.find_element_by_id("search_button_homepage").click()
print(driver.current_url)
driver.quit()

How can I modify this code so that it sends these custom headers?

Please help.

4 answers:

Answer 0: (score: 15)

Andriy Ivaneyko's approach did not work for me (PhantomJS 2.1.1 and Selenium 2.48.0).

I wrote a complete example that sets the headers, the window size, and the proxy in Selenium PhantomJS:

from selenium import webdriver

def init_phantomjs_driver(*args, **kwargs):
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0',
        'Connection': 'keep-alive'
    }

    # Register each header as a PhantomJS capability, keyed by header name.
    # items() works on both Python 2 and Python 3.
    for key, value in headers.items():
        webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value

    # The user agent has its own setting (see Note 1).
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.settings.userAgent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'

    driver = webdriver.PhantomJS(*args, **kwargs)
    driver.set_window_size(1400, 1000)
    return driver

def main():
    service_args = [
        '--proxy=127.0.0.1:9999',
        '--proxy-type=http',
        '--ignore-ssl-errors=true'
    ]
    driver = init_phantomjs_driver(service_args=service_args)
    driver.get('http://cn.bing.com')

if __name__ == '__main__':
    main()

Note 1:

The userAgent is set in phantomjs.page.settings.userAgent rather than in phantomjs.page.customHeaders.

Note 2:

Andriy Ivaneyko builds DesiredCapabilities.PHANTOMJS with enumerate, so the key is the loop index. The capability keys come out as phantomjs.page.customHeaders.0, phantomjs.page.customHeaders.1, and so on, with the header names stored as the values.

The header attributes are not set correctly.
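
To make Note 2 concrete, here is a small sketch in plain Python (no Selenium required) that compares the two loops. The `wrong` and `right` dicts are stand-ins for `DesiredCapabilities.PHANTOMJS`, and the shortened `headers` dict is only illustrative.

```python
# Stand-in header dict; shortened for illustration.
headers = {
    'Accept': '*/*',
    'Cache-Control': 'max-age=0',
}

prefix = 'phantomjs.page.customHeaders.{}'

# enumerate() over a dict yields (index, header_name) pairs, so the
# header names end up as values and the real header values are lost.
wrong = {}
for key, value in enumerate(headers):
    wrong[prefix.format(key)] = value

# items() yields (header_name, header_value) pairs, which is what the
# customHeaders capabilities need.
right = {}
for key, value in headers.items():
    right[prefix.format(key)] = value

print(sorted(wrong.items()))
print(sorted(right.items()))
```

Printing both dicts side by side shows the index-based keys produced by `enumerate` next to the name-based keys produced by `items()`.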

Answer 1: (score: 13)

Set the headers in the following way:

from selenium import webdriver


headers = { 'Accept':'*/*',
    'Accept-Encoding':'gzip, deflate, sdch',
    'Accept-Language':'en-US,en;q=0.8',
    'Cache-Control':'max-age=0',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
}

for key, value in enumerate(headers):
    capability_key = 'phantomjs.page.customHeaders.{}'.format(key)
    webdriver.DesiredCapabilities.PHANTOMJS[capability_key] = value

Then start using your driver:

service_args = [
    '--proxy=127.0.0.1:9999',
    '--proxy-type=socks5',
]
driver = webdriver.PhantomJS(service_args=service_args)
# ............... 

Answer 2: (score: 1)

from selenium import webdriver

headers = { 'Accept':'*/*',
    'Accept-Encoding':'gzip, deflate, sdch',
    'Accept-Language':'en-US,en;q=0.8',
    'Cache-Control':'max-age=0',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36' }

for key in headers:
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = headers[key]

Answer 3: (score: 0)

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 "
"(KHTML, like Gecko) Chrome/15.0.87")

driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get("http://www.google.com")
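
The answers above could be combined into one helper. What follows is only a sketch under some assumptions: `build_phantomjs_capabilities` is a hypothetical name, not part of Selenium, and the `base` dict is a stand-in for `dict(DesiredCapabilities.PHANTOMJS)`. It works on a copy, so the shared `DesiredCapabilities.PHANTOMJS` dict is left untouched, and it routes `User-Agent` to `phantomjs.page.settings.userAgent` as Note 1 of Answer 0 suggests.

```python
def build_phantomjs_capabilities(headers, base=None):
    """Return a new capabilities dict with the given headers applied.

    Hypothetical helper for illustration; not a Selenium API.
    """
    caps = dict(base or {})  # copy, so the caller's dict is not mutated
    for name, value in headers.items():
        if name.lower() == 'user-agent':
            # The user agent has a dedicated PhantomJS page setting.
            caps['phantomjs.page.settings.userAgent'] = value
        else:
            caps['phantomjs.page.customHeaders.{}'.format(name)] = value
    return caps


# Stand-in for dict(DesiredCapabilities.PHANTOMJS).
base = {'browserName': 'phantomjs'}

headers = {
    'Accept': '*/*',
    'Accept-Language': 'en-US,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
}

caps = build_phantomjs_capabilities(headers, base)
print(sorted(caps.items()))

# With a Selenium release that still ships the PhantomJS driver, the
# result would be passed the same way Answer 3 passes dcap:
#   driver = webdriver.PhantomJS(desired_capabilities=caps)
```

Returning a fresh dict rather than writing into the module-level one keeps the header settings local to a single driver, which may matter if several drivers with different headers are created in the same process.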