Running Selenium behind a proxy server

Asked: 2013-08-01 08:28:28

Tags: python selenium selenium-webdriver proxy web-scraping

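I have been using Selenium from Python for automated browser simulation and web scraping, and it has worked well for me. Now, however, I have to run it behind a proxy server. Selenium opens the browser window, but the requested page fails to load because no proxy settings are configured in the browser it launches. My current code looks like this (sample):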

from selenium import webdriver

sel = webdriver.Firefox()
sel.get('http://www.google.com')
sel.title
sel.quit()

How can I change the code above so that it uses a proxy server?

4 answers:

Answer 0 (score: 22)

You need to set either the desired capabilities or the browser profile, like this:

from selenium import webdriver

profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)  # 1 = manual proxy configuration
profile.set_preference("network.proxy.http", "proxy.server.address")
# Replace 8080 with your proxy's port; the value must be an integer, not a string.
profile.set_preference("network.proxy.http_port", 8080)
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile=profile)
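
The preference names used in this answer can be gathered into a small helper so that the port is always converted to an integer before it reaches the profile. This is only a minimal sketch and not part of the original answer: `proxy_prefs` and `apply_prefs` are hypothetical names, and `apply_prefs` assumes nothing more than an object with a `set_preference(name, value)` method, such as the `FirefoxProfile` used here.

```python
def proxy_prefs(host, port):
    """Return Firefox preferences for a manual HTTP/HTTPS proxy.

    Hypothetical helper: the preference names come from this thread,
    the function itself is illustrative.
    """
    port = int(port)  # port preferences are integers, not strings
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,  # proxy used for HTTPS traffic
        "network.proxy.ssl_port": port,
    }


def apply_prefs(profile, prefs):
    """Copy each preference onto any object exposing set_preference()."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile
```

With Selenium installed, `apply_prefs(webdriver.FirefoxProfile(), proxy_prefs("proxy.server.address", "8080"))` should give a profile configured for both HTTP and HTTPS traffic through the same proxy.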


Answer 1 (score: 8)

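The official Selenium documentation (http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp#using-a-proxy) provides clear and helpful guidance on using a proxy. For Firefox (the browser used in your sample code), you would do the following: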

from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType

myProxy = "host:8080"

proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': myProxy,
    'ftpProxy': myProxy,
    'sslProxy': myProxy,
    'noProxy': '' # set this value as desired
    })

driver = webdriver.Firefox(proxy=proxy)
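
Because the same `host:port` string is reused for every protocol, it can help to validate it once before building the dictionary. The following is a minimal sketch under that assumption, not part of the original answer: `split_host_port` and `manual_proxy_settings` are hypothetical helpers, and the proxy type is passed in by the caller (for example `ProxyType.MANUAL`) so the sketch runs without Selenium installed. IPv6 literals are not handled.

```python
def split_host_port(address):
    """Split 'host:port' into (host, port), validating the port number."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected 'host:port', got %r" % address)
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return host, port


def manual_proxy_settings(address, proxy_type, no_proxy=""):
    """Build a settings dictionary with the keys used in this answer.

    proxy_type is supplied by the caller (e.g. ProxyType.MANUAL), so this
    sketch does not import Selenium itself.
    """
    split_host_port(address)  # raises ValueError on a malformed address
    return {
        "proxyType": proxy_type,
        "httpProxy": address,
        "ftpProxy": address,
        "sslProxy": address,
        "noProxy": no_proxy,
    }
```

The result could then be passed to `Proxy(...)` in place of the literal dictionary, e.g. `Proxy(manual_proxy_settings("host:8080", ProxyType.MANUAL))`.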

Answer 2 (score: 3)

This will do the job:

from selenium import webdriver

proxyHost = "my.proxy.host or IP"
proxyPort = "55555"

fp = webdriver.FirefoxProfile()
fp.set_preference("network.proxy.type", 1)
#fp.set_preference("network.proxy.http", proxyHost) #HTTP PROXY
#fp.set_preference("network.proxy.http_port", int(proxyPort))
#fp.set_preference("network.proxy.ssl", proxyHost) #SSL  PROXY
#fp.set_preference("network.proxy.ssl_port", int(proxyPort))
fp.set_preference('network.proxy.socks', proxyHost) #SOCKS PROXY
fp.set_preference('network.proxy.socks_port', int(proxyPort))
fp.update_preferences()

driver = webdriver.Firefox(firefox_profile=fp)

driver.get("http://www.whatismyip.com/")
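
For a SOCKS proxy, Firefox also has the preferences `network.proxy.socks_version` and `network.proxy.socks_remote_dns`. The sketch below collects them alongside the two SOCKS preferences used in this answer. It is illustrative only: `socks_prefs` is a hypothetical helper, and the defaults (SOCKS5, DNS resolved through the proxy) are assumptions to adjust for your proxy.

```python
def socks_prefs(host, port, version=5, remote_dns=True):
    """Return Firefox preferences for a manual SOCKS proxy.

    Hypothetical helper; the version and remote_dns defaults are
    assumptions, not values taken from the original answer.
    """
    if version not in (4, 5):
        raise ValueError("SOCKS version must be 4 or 5")
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.socks": host,
        "network.proxy.socks_port": int(port),  # must be an integer
        "network.proxy.socks_version": version,
        "network.proxy.socks_remote_dns": bool(remote_dns),
    }
```

Each entry can be applied to a profile by looping over `socks_prefs(proxyHost, proxyPort).items()` and calling `fp.set_preference(name, value)` before `fp.update_preferences()`.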

Answer 3 (score: 0)

from selenium import webdriver

def install_proxy(PROXY_HOST, PROXY_PORT):
    fp = webdriver.FirefoxProfile()
    print(PROXY_PORT)
    print(PROXY_HOST)
    fp.set_preference("network.proxy.type", 1)
    fp.set_preference("network.proxy.http",PROXY_HOST)
    fp.set_preference("network.proxy.http_port",int(PROXY_PORT))
    fp.set_preference("network.proxy.https",PROXY_HOST)
    fp.set_preference("network.proxy.https_port",int(PROXY_PORT))
    fp.set_preference("network.proxy.ssl",PROXY_HOST)
    fp.set_preference("network.proxy.ssl_port",int(PROXY_PORT))  
    fp.set_preference("network.proxy.ftp",PROXY_HOST)
    fp.set_preference("network.proxy.ftp_port",int(PROXY_PORT))   
    fp.set_preference("network.proxy.socks",PROXY_HOST)
    fp.set_preference("network.proxy.socks_port",int(PROXY_PORT))   
    fp.set_preference("general.useragent.override","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A")
    fp.update_preferences()
    return webdriver.Firefox(firefox_profile=fp)