python tor stem HTTP Error 503

Time: 2014-11-12 11:52:30

Tags: python-3.x

I am currently trying to get a new IP address through Python.

Here is my source:

import random
import time
import urllib.parse
import urllib.request

from stem import Signal
from stem.control import Controller


# Route HTTP requests through Privoxy, which forwards them to Tor.
proxy_support = urllib.request.ProxyHandler({"http" : "127.0.0.1:8118"})
opener = urllib.request.build_opener(proxy_support)

UA = [
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.73.11 (KHTML, like Gecko) Version/7.0.1 Safari/537.73.11',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:26.0) Gecko/20100101 Firefox/26.0',
        'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0',
        'Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0',
        'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36'
    ]

def newI():
    """Ask Tor for a new identity and print the traffic counters."""
    # The controller closes itself when the with block exits.
    with Controller.from_port(port = 9051) as controller:
        controller.authenticate()
        controller.signal(Signal.NEWNYM)
        # Total bytes Tor has read and written so far.
        bytes_read = controller.get_info("traffic/read")
        bytes_written = controller.get_info("traffic/written")

        print (bytes_read)
        print (bytes_written)

if __name__ == '__main__':
    params = 'site:google.com admin'
    page = 0
    # Install the proxied opener once; urlopen() then uses it for every request.
    urllib.request.install_opener(opener)
    for i in range(100):
        url = 'http://www.google.co.kr/search?hl=ko&q=%s&start=%d' %(urllib.parse.quote(params), page)
        user_agent = random.choice(UA)
        headers = {'User-Agent' : user_agent}
        # Pause 1 to 4 seconds between requests.
        random_interval = random.randrange(1, 5, 1)
        time.sleep(random_interval)
        req = urllib.request.Request(url, headers = headers)
        res = urllib.request.urlopen(req)
        html = res.read()
        print (len(html))
        page = page + 10
        newI()
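
A side note on newI(): Tor rate-limits NEWNYM signals, so sending one after every request may not produce a new circuit each time. Below is a minimal sketch of waiting out that limit first. It assumes `controller` is an already authenticated stem Controller; `request_new_identity` and its `sleep` parameter are hypothetical names introduced only for this illustration.

```python
import time


def request_new_identity(controller, sleep=time.sleep):
    """Ask Tor for new circuits, waiting out its NEWNYM rate limit first.

    `controller` is assumed to be an authenticated stem Controller.
    Returns the number of seconds that were waited.
    """
    # Seconds until Tor will honor another NEWNYM (0 if it is ready now).
    wait = controller.get_newnym_wait()
    if wait > 0:
        sleep(wait)
    # With stem imported, Signal.NEWNYM is the usual argument; the plain
    # string is used here only so the sketch does not need stem imported.
    controller.signal("NEWNYM")
    return wait
```

In the script above, this would be called inside newI() after controller.authenticate(), in place of the bare controller.signal(Signal.NEWNYM) call.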

My Vidalia and Privoxy are both running. I configured my settings correctly: Web proxy (HTTP): 127.0.0.1:8118, and the same for HTTPS. In my Privoxy config file I have this line:

forward-socks5   /               127.0.0.1:9050 .
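
With that line, Privoxy listens on 127.0.0.1:8118 and forwards traffic to Tor's SOCKS port 9050. The script only registers the proxy for the "http" scheme, so any https URL would bypass Privoxy. A minimal sketch of an opener that covers both schemes follows; `build_privoxy_opener` is a hypothetical helper name, and the host and port defaults are taken from the settings described above.

```python
import urllib.request


def build_privoxy_opener(host="127.0.0.1", port=8118):
    """Return an opener that sends both http and https requests via Privoxy."""
    address = "%s:%d" % (host, port)
    proxy_handler = urllib.request.ProxyHandler(
        {"http": address, "https": address}
    )
    return urllib.request.build_opener(proxy_handler)
```

The returned opener can be passed to urllib.request.install_opener() in the same way as the `opener` in the script.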

Even so, when I run the code it stays stuck at case 1 and I cannot get a new IP. These are my Vidalia settings:

1. settings > Sharing > Run as client only
2. settings > Advanced > 127.0.0.1 : 9051

And this is the error I get when I run the code:

Traceback (most recent call last):
  File "C:/Users/kwon/PycharmProjects/google_search/test.py", line 50, in <module>
    res = urllib.request.urlopen(req)
  File "C:\Python33\lib\urllib\request.py", line 156, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python33\lib\urllib\request.py", line 475, in open
    response = meth(req, response)
  File "C:\Python33\lib\urllib\request.py", line 587, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python33\lib\urllib\request.py", line 507, in error
    result = self._call_chain(*args)
  File "C:\Python33\lib\urllib\request.py", line 447, in _call_chain
    result = func(*args)
  File "C:\Python33\lib\urllib\request.py", line 692, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python33\lib\urllib\request.py", line 475, in open
    response = meth(req, response)
  File "C:\Python33\lib\urllib\request.py", line 587, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python33\lib\urllib\request.py", line 513, in error
    return self._call_chain(*args)
  File "C:\Python33\lib\urllib\request.py", line 447, in _call_chain
    result = func(*args)
  File "C:\Python33\lib\urllib\request.py", line 595, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable
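
The traceback shows urlopen() following a 302 redirect and then raising HTTPError for the 503 response, which ends the loop on the first failure. A minimal sketch of catching the error so the status code can be inspected instead is shown here; `fetch` and its `opener` parameter are hypothetical names for illustration, not part of the original script.

```python
import urllib.error
import urllib.request


def fetch(url, headers=None, opener=None):
    """Return (status, body); an HTTP error status is returned, not raised."""
    opener = opener or urllib.request.build_opener()
    req = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(req) as res:
            return res.getcode(), res.read()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code and the body of the error page.
        return err.code, err.read()
```

A caller could then check for a 503 status and decide whether to wait, request a new identity, or stop, rather than crashing.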

What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

Google blocks automated requests; see the other post, "Tor blocked by Google".