Trouble using ProxyHandler and Selenium in Python

Asked: 2014-03-08 02:24:45

Tags: python python-2.7 selenium urllib2

I am trying to scrape web pages with Selenium through a proxy IP address. I am running Python 2.7.3 on Mac OS X 10.7.5, and I have the following Python code:

 import urllib2
 from selenium import webdriver

 fileproxylist = open('proxylist.txt', 'r')
 proxyList = fileproxylist.readlines()
 indexproxy = 0
 totalproxy = len(proxyList)


 def get_source_html_proxy(url, proxip):

     proxyip=urllib2.ProxyHandler({'http':proxip})
     opener = urllib2.build_opener(proxyip)
     urllib2.install_opener(opener)
     req=urllib2.Request(url)
     sock=urllib2.urlopen(req)
     data = sock.read()
     return data

 browser = webdriver.Chrome()
 browser.get(get_source_html_proxy(MyUrl,proxyList[0]))

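Here MyUrl is the URL of the site I want to scrape, and proxyList[0] is the proxy IP address I want the request to come from, instead of my local machine's IP address. When I run this code, I get the following error: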

 Traceback (most recent call last):
   File "Scrape.py", line 89, in <module>
     browser.get(get_source_html_proxy(MyUrl,proxyList[0]))
   File "Scrape.py", line 83, in get_source_html_proxy
     sock=urllib2.urlopen(req)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
     return _opener.open(url, data, timeout)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
     response = self._open(req, data)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 418, in _open
     '_open', req)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
     result = func(*args)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1207, in http_open
     return self.do_open(httplib.HTTPConnection, req)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1177, in do_open
     raise URLError(err)
 urllib2.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

I'm not sure what the problem is here. Can someone help me figure out what is going on? Thanks!
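For reference, a minimal sketch of two things that may be worth checking here. It is not a confirmed fix. First, readlines() keeps the trailing newline on every entry, so the proxy string handed to ProxyHandler may contain whitespace, and Errno 8 means a host name could not be resolved. Second, browser.get() takes a URL rather than HTML source, so if the goal is for Chrome itself to go through the proxy, the proxy probably has to be given to the browser (here via Chrome's --proxy-server switch). The helper names clean_proxy_lines, build_proxy_opener and chrome_with_proxy and the sample addresses are made up for illustration, and the sketch assumes proxylist.txt holds one host:port entry per line.

```python
try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # same names on Python 3


def clean_proxy_lines(lines):
    # readlines() keeps the trailing newline on each entry, so strip it
    # and skip lines that are empty after stripping.
    stripped = (line.strip() for line in lines)
    return [entry for entry in stripped if entry]


def build_proxy_opener(proxy):
    # Build an opener that sends http:// requests through `proxy`
    # without installing it globally.
    handler = urllib2.ProxyHandler({'http': proxy})
    return urllib2.build_opener(handler)


def chrome_with_proxy(proxy):
    # Selenium is imported lazily so the helpers above work without it.
    from selenium import webdriver
    options = webdriver.ChromeOptions()
    options.add_argument('--proxy-server=%s' % proxy)
    # Assumption: a recent Selenium release; older ones take chrome_options=.
    return webdriver.Chrome(options=options)


# Made-up entries standing in for the contents of proxylist.txt.
sample_lines = ['10.0.0.1:8080\n', '\n', '  10.0.0.2:3128 \r\n']
proxies = clean_proxy_lines(sample_lines)
opener = build_proxy_opener(proxies[0])
```

With this in place, the browser would be created with chrome_with_proxy(proxies[0]) and then pointed at the page with browser.get(MyUrl), while opener.open(MyUrl).read() would fetch the source through urllib2 alone.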

0 answers:

No answers