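I'm trying to read one line (an IP address) from a file, open a website using that address as a proxy, and then repeat this for every address in the file. Instead, I get an error. I'm new to Python, so maybe it's a simple mistake. Thanks in advance!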
CODE:
>>> import urllib2
>>> f = open("proxy.txt", "r")  # file containing a list of IP addresses
>>> address = f.readline().strip()  # remove the \n at the end of the line
>>>
>>> while address:  # stops at the first empty line or at end of file
        proxy = urllib2.ProxyHandler({'http': address})
        opener = urllib2.build_opener(proxy)
        urllib2.install_opener(opener)
        urllib2.urlopen('http://www.google.com')
        address = f.readline().strip()
ERROR:
Traceback (most recent call last):
  File "<pyshell#15>", line 5, in <module>
    urllib2.urlopen('http://www.google.com')
  File "D:\Programming\Python\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "D:\Programming\Python\lib\urllib2.py", line 394, in open
    response = self._open(req, data)
  File "D:\Programming\Python\lib\urllib2.py", line 412, in _open
    '_open', req)
  File "D:\Programming\Python\lib\urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "D:\Programming\Python\lib\urllib2.py", line 1199, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "D:\Programming\Python\lib\urllib2.py", line 1174, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
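Answer 0 (score: 1)
This means the proxy is unavailable: Errno 10060 is the Windows socket timeout error (WSAETIMEDOUT), so the connection attempt to the proxy got no response in time.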
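If the goal is just to keep the loop from the question running past a dead proxy, one option is to catch the error and move on to the next address. The following is a minimal sketch, not a definitive fix: the names `fetch_via_proxy` and `fetch_via_each_proxy` and the `fetch` parameter are hypothetical, and the 5-second timeout is an arbitrary assumption. It also calls `opener.open()` directly rather than `install_opener()`, so no global state is changed.

```python
try:
    from urllib2 import ProxyHandler, build_opener  # Python 2
except ImportError:
    from urllib.request import ProxyHandler, build_opener  # Python 3


def fetch_via_proxy(address, url, timeout=5):
    """Return the response body fetched through one proxy, or None on failure."""
    opener = build_opener(ProxyHandler({'http': address}))
    try:
        response = opener.open(url, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()
    except EnvironmentError:  # covers URLError and socket timeouts
        return None


def fetch_via_each_proxy(lines, url, fetch=fetch_via_proxy):
    """Yield (address, body_or_None) for every non-empty line in `lines`.

    `fetch` is a hypothetical hook so the loop can be tried without a network.
    """
    for line in lines:
        address = line.strip()
        if address:  # skip blank lines instead of stopping at them
            yield address, fetch(address, url)
```

With the file from the question, this could be used as `for address, body in fetch_via_each_proxy(open("proxy.txt"), "http://www.google.com")`; a body of `None` marks a proxy that did not respond.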
Here's a proxy checker that checks several proxies concurrently:
#!/usr/bin/env python
import fileinput # accept proxies from files or stdin

try:
    from gevent.pool import Pool # $ pip install gevent
    import gevent.monkey; gevent.monkey.patch_all() # patch stdlib
except ImportError: # fallback on using threads
    from multiprocessing.dummy import Pool

try:
    from urllib2 import ProxyHandler, build_opener
except ImportError: # Python 3
    from urllib.request import ProxyHandler, build_opener

def is_proxy_alive(proxy, timeout=5):
    opener = build_opener(ProxyHandler({'http': proxy})) # test redir. and such
    try: # send request, read response headers, close connection
        opener.open("http://example.com", timeout=timeout).close()
    except EnvironmentError:
        return None
    else:
        return proxy

candidate_proxies = (line.strip() for line in fileinput.input())
pool = Pool(20) # use 20 concurrent connections
for proxy in pool.imap_unordered(is_proxy_alive, candidate_proxies):
    if proxy is not None:
        print(proxy)
Usage:
$ python alive-proxies.py proxy.txt
$ echo user:password@ip:port | python alive-proxies.py