Proxybroker IndexError

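Time: 2016-09-30 16:35:00

Tags: python asynchronous python-3.5 python-asyncio aiohttp

I am trying to use the Python package proxybroker, starting from one of the examples mentioned here. I simply copied the following example to run it locally: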

import asyncio

from proxybroker import Broker


async def save(proxies, filename):
    """Save proxies to a file."""
    with open(filename, 'w') as f:
        while True:
            proxy = await proxies.get()
            if proxy is None:
                break
            proto = 'https' if 'HTTPS' in proxy.types else 'http'
            row = '%s://%s:%d\n' % (proto, proxy.host, proxy.port)
            f.write(row)


def main():
    proxies = asyncio.Queue()
    broker = Broker(proxies)
    tasks = asyncio.gather(broker.find(types=['HTTP', 'HTTPS'], limit=10),
                           save(proxies, filename='proxies.txt'))
    loop = asyncio.get_event_loop()
    loop.run_until_complete(tasks)


if __name__ == '__main__':
    main()
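
The queue handling in save can be tried without proxybroker or network access. The following is a minimal sketch, assuming (as the example implies) that the producer puts None on the queue once it has finished. The names fake_find, collect and demo are hypothetical stand-ins and not part of proxybroker, the addresses are documentation placeholders, and asyncio.run requires Python 3.7 or newer.

```python
import asyncio


async def fake_find(queue, items):
    """Hypothetical stand-in for broker.find: push items, then a None sentinel."""
    for item in items:
        await queue.put(item)
    await queue.put(None)  # tells the consumer that nothing more will arrive


async def collect(queue):
    """Drain the queue until the None sentinel, the same loop shape as save()."""
    rows = []
    while True:
        item = await queue.get()
        if item is None:
            break
        rows.append('http://%s:%d' % item)
    return rows


async def demo():
    queue = asyncio.Queue()
    # Run producer and consumer concurrently, as main() does with gather().
    _, rows = await asyncio.gather(
        fake_find(queue, [('192.0.2.1', 8080), ('192.0.2.2', 3128)]),
        collect(queue))
    return rows


rows = asyncio.run(demo())
```

If the producer raises before it puts the sentinel on the queue, gather propagates the exception to the caller, which is how the traceback in this question ends up in main().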

When I try to run the code, it throws the following error, along with a number of deprecation warnings:

/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
  warnings.warn("Use async with instead", DeprecationWarning)
https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
Traceback (most recent call last):
  File "/home/sebastian/PycharmProjects/testing/test/test_prox.py", line 27, in <module>
    main()
  File "/home/sebastian/PycharmProjects/testing/test/test_prox.py", line 23, in main
    loop.run_until_complete(tasks)
  File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
    result = coro.throw(exc)
  File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/api.py", line 108, in find
    await self._run(self._checker.check_judges(), action)
  File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/api.py", line 114, in _run
    await tasks
  File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
    future.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
    result = coro.throw(exc)
  File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/checker.py", line 26, in check_judges
    await asyncio.gather(*[j.check() for j in self._judges])
  File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
    future.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/judge.py", line 82, in check
    j=self, code=resp.status, page=page[0],
IndexError: string index out of range

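I am using Python 3.5.2 with the latest versions of proxybroker (0.1.4), aiohttp (1.0.2) and asyncio (3.4.3).

I am not sure what causes the error, since I did not change the example code and, as far as I know, I have installed all dependencies. Can anyone help me and tell me what I am doing wrong or, even better, how to do it right?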

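Edit
A quick workaround for this problem is to change the line where the error occurs. That line is only used to log an error, so changing it should do no harm. For this workaround (not a solution) I added an extra check in judge.py at line 79, just before the statement that raised the exception. Locally I changed it to: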

        if isinstance(page, (list, dict)):
            log.error(('{j} is failed. HTTP status code: {code}; '
                       'Real IP on page: {ip}; Version: {word}; '
                       'Response: {page}').format(
                j=self, code=resp.status, page=page[0],
                ip=(get_my_ip() in page), word=(rv in page)))
        else:
            log.error(('{j} is failed. HTTP status code: {code}; '
                       'Real IP on page: {ip}; Version: {word}; '
                       'Response: {page}').format(
                j=self, code=resp.status, page=page,
                ip=(get_my_ip() in page), word=(rv in page)))

This way I can use proxybroker again. The issue has been filed on proxybroker's GitHub.
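
The underlying failure can be reproduced in isolation: indexing an empty string with page[0] raises IndexError, while slicing with page[:1] does not. The following is a minimal sketch, assuming page is a string as the traceback suggests. The function judge_failure_message and its parameters are hypothetical and not part of proxybroker; the message template is copied from the log call above.

```python
def judge_failure_message(judge, status, page, my_ip, marker):
    """Build the judge failure message without indexing into page.

    page[:1] yields '' for an empty body instead of raising IndexError.
    """
    return ('{j} is failed. HTTP status code: {code}; '
            'Real IP on page: {ip}; Version: {word}; '
            'Response: {page}').format(
        j=judge, code=status, page=page[:1],
        ip=(my_ip in page), word=(marker in page))


# Hypothetical inputs: an empty body, a placeholder IP and a placeholder marker.
message = judge_failure_message('J', 302, '', '203.0.113.7', 'marker')
```

Unlike the isinstance check, this keeps a single log call, at the cost of logging only the first character of a non-empty body.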

2 answers:

Answer 0: (score: 0)

The deprecation warnings are harmless (at least until I remove this backward compatibility).

The error simply says that getproxy.net is unavailable; that is your main problem.

Answer 1: (score: 0)

At the moment the following problem occurs:

<Judge [HTTP] www.ingosander.net> is failed. HTTP status code: 302; Real IP on page: False; Version: False; Response:

The response body is empty because the request was redirected (status code: 302).

Possible solutions:

  1. Change the HTTP client. Use the requests package, which follows redirects automatically.
  2. Change the redirect URL.
  3. Add error logging for the redirect case in judge.py:

elif resp.status == 302:
    log.error(('{j} is failed. HTTP status code: {code}; '
               'Real IP on page: {ip}; Version: {word}; '
               'Response: {page}').format(
              j=self, code=resp.status, page=None,
              ip=(get_my_ip() in page), word=(rv in page)))

  4. Comment out this judge in the judgeList at the bottom of judge.py.
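
The last option can also be expressed as a filter instead of editing the library source. This is only a sketch: drop_judges is a hypothetical helper, the URLs are placeholders with made-up paths, and how the resulting list would be handed to proxybroker is not covered here.

```python
from urllib.parse import urlparse


def drop_judges(judge_urls, blocked_hosts):
    """Return judge_urls without the entries whose host is in blocked_hosts."""
    blocked = {host.lower() for host in blocked_hosts}
    return [url for url in judge_urls
            if (urlparse(url).hostname or '') not in blocked]


# Placeholder URLs for illustration only; the paths are made up.
judges = [
    'http://www.ingosander.net/judge',
    'http://judge.example.org/judge',
]
active = drop_judges(judges, ['www.ingosander.net'])
```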