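I am trying to use the Python package proxybroker, following one of the examples mentioned here. I copied the following example unchanged to run it locally: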
    import asyncio
    from proxybroker import Broker


    async def save(proxies, filename):
        """Save proxies to a file."""
        with open(filename, 'w') as f:
            while True:
                proxy = await proxies.get()
                if proxy is None:
                    break
                proto = 'https' if 'HTTPS' in proxy.types else 'http'
                row = '%s://%s:%d\n' % (proto, proxy.host, proxy.port)
                f.write(row)


    def main():
        proxies = asyncio.Queue()
        broker = Broker(proxies)
        tasks = asyncio.gather(broker.find(types=['HTTP', 'HTTPS'], limit=10),
                               save(proxies, filename='proxies.txt'))
        loop = asyncio.get_event_loop()
        loop.run_until_complete(tasks)


    if __name__ == '__main__':
        main()
When I run the code, it raises the following error, along with several deprecation warnings:
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    /home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/aiohttp/client.py:494: DeprecationWarning: Use async with instead
      warnings.warn("Use async with instead", DeprecationWarning)
    https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
    https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
    https://getproxy.net/en/ is failed. Error: ClientOSError(101, 'Cannot connect to host getproxy.net:443 ssl:True [Can not connect to getproxy.net:443 [Network is unreachable]]');
    Traceback (most recent call last):
      File "/home/sebastian/PycharmProjects/testing/test/test_prox.py", line 27, in <module>
        main()
      File "/home/sebastian/PycharmProjects/testing/test/test_prox.py", line 23, in main
        loop.run_until_complete(tasks)
      File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
        return future.result()
      File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
        raise self._exception
      File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
        result = coro.throw(exc)
      File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/api.py", line 108, in find
        await self._run(self._checker.check_judges(), action)
      File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/api.py", line 114, in _run
        await tasks
      File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
        yield self  # This tells Task to wait for completion.
      File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
        future.result()
      File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
        raise self._exception
      File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
        result = coro.throw(exc)
      File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/checker.py", line 26, in check_judges
        await asyncio.gather(*[j.check() for j in self._judges])
      File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
        yield self  # This tells Task to wait for completion.
      File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
        future.result()
      File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
        raise self._exception
      File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
        result = coro.send(None)
      File "/home/sebastian/PycharmProjects/STW/venv/lib/python3.5/site-packages/proxybroker/judge.py", line 82, in check
        j=self, code=resp.status, page=page[0],
    IndexError: string index out of range
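I am using Python 3.5.2 and the latest version of proxybroker (0.1.4), with aiohttp (1.0.2) and asyncio (3.4.3).
I am not sure what causes the error, because I did not change the example code and, as far as I can tell, I have installed all the dependencies. Can anyone tell me what I am doing wrong or, even better, how to do it right?
Edit
A quick workaround for this problem is to change the line where the error occurs. That line is only used to log an error, so changing it should do no harm.
For this workaround (it is not a real fix) I added an extra check in judge.py at line 79, just before the point where the exception was raised.
Locally I changed it to: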
    # Only index into page when it is a list or dict; a str is logged as-is.
    if isinstance(page, (list, dict)):
        log.error(('{j} is failed. HTTP status code: {code}; '
                   'Real IP on page: {ip}; Version: {word}; '
                   'Response: {page}').format(
            j=self, code=resp.status, page=page[0],
            ip=(get_my_ip() in page), word=(rv in page)))
    else:
        log.error(('{j} is failed. HTTP status code: {code}; '
                   'Real IP on page: {ip}; Version: {word}; '
                   'Response: {page}').format(
            j=self, code=resp.status, page=page,
            ip=(get_my_ip() in page), word=(rv in page)))
With this change I can use proxybroker again. The issue has been reported on proxybroker's GitHub.
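The `IndexError: string index out of range` in the traceback is what Python raises when an empty string is indexed with `[0]`. The following is only a minimal sketch of that behavior and of a slice-based alternative that never raises. The helper names `preview` and `describe_failure` are hypothetical and are not part of proxybroker.

```python
def preview(page, limit=1):
    """Return up to `limit` leading characters/items of page.

    Hypothetical helper: slicing an empty str or list yields an empty
    result instead of raising IndexError, unlike page[0].
    """
    if page is None:
        return ''
    return page[:limit]


def describe_failure(judge, status, page):
    """Build a log message similar in spirit to the one in judge.py (assumed shape)."""
    return ('{j} is failed. HTTP status code: {code}; '
            'Response: {page!r}').format(
        j=judge, code=status, page=preview(page, limit=80))


# Reproduce the error from the traceback with an empty response body.
try:
    ''[0]
except IndexError as exc:
    error_text = str(exc)

# The slice-based variant handles the same empty body without raising.
message = describe_failure('<Judge example>', 302, '')
```

Here `error_text` holds the same message as in the traceback, while `message` is built without any exception even though the body is empty.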
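Answer 0: (score: 0)
The deprecation warnings are harmless (at least until I remove this backward compatibility).
The error simply says that getproxy.net is unavailable; that is your main problem.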
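If the harmless deprecation warnings clutter the output, one option is to filter them with the standard `warnings` module. This is only a sketch: `noisy()` is a hypothetical stand-in that emits the same message as the aiohttp call in the traceback, and whether silencing the warning is appropriate depends on your setup.

```python
import warnings


def noisy():
    """Hypothetical stand-in for a library call that emits the warning."""
    warnings.warn("Use async with instead", DeprecationWarning)
    return "ok"


# Without a filter, the warning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()
    shown_before = len(caught)

# With an "ignore" filter for this message and category, nothing is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings(
        "ignore",
        message="Use async with instead",
        category=DeprecationWarning,
    )
    noisy()
    shown_after = len(caught)
```

In a real script the `filterwarnings` call would go near the top, before the event loop starts. The `catch_warnings` blocks here exist only to make the effect observable.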
Answer 1: (score: 0)
Currently the following problem occurs:
<Judge [HTTP] www.ingosander.net> is failed. HTTP status code: 302; Real IP on page: False; Version: False; Response:
The response body is empty because the request was redirected (status code: 302).
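To illustrate why a redirect leads to an empty body, here is a small self-contained sketch using only the standard library. It starts a throwaway local server that always answers 302 and reads the response with `http.client`, which does not follow redirects. The handler, the `/elsewhere` target, and the function names are assumptions made for the demonstration and have nothing to do with proxybroker's internals.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a 302 and no body (demo only)."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "/elsewhere")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_without_following(host, port, path="/"):
    """Fetch a path and return (status, Location header, body) as-is."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", errors="replace")
        return resp.status, resp.getheader("Location"), body
    finally:
        conn.close()


def demo():
    # Port 0 lets the OS pick a free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch_without_following(host, port)
    finally:
        server.shutdown()
        server.server_close()


status, location, body = demo()
```

The body comes back as an empty string, so any code that evaluates `body[0]` on such a response fails with the `IndexError` shown in the question.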
Possible solutions:
1. Change the HTTP client. Use the requests package, which follows redirects automatically.
2. Add error logging for the redirect case in judge.py:
    elif resp.status == 302:
        log.error(('{j} is failed. HTTP status code: {code}; '
                   'Real IP on page: {ip}; Version: {word}; '
                   'Response: {page}').format(
            j=self, code=resp.status, page=None,
            ip=(get_my_ip() in page), word=(rv in page)))
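As a rough illustration of the first option, the sketch below shows a client that follows redirects and therefore ends up with a non-empty body. It uses the standard library's `urllib.request` as a stand-in for requests, against a throwaway local server. The handler, the `/final` path, and the payload are made up for the demonstration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class JudgeLikeHandler(BaseHTTPRequestHandler):
    """Redirect everything to /final, which returns a small body (demo only)."""

    def do_GET(self):
        if self.path == "/final":
            payload = b"REMOTE_ADDR = 203.0.113.7"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
        else:
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_following_redirects(url):
    """Fetch url, following redirects; return (status, final URL, body)."""
    # An empty ProxyHandler avoids picking up proxy settings from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return resp.status, resp.geturl(), resp.read().decode("utf-8")


def demo():
    server = HTTPServer(("127.0.0.1", 0), JudgeLikeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch_following_redirects("http://%s:%d/" % (host, port))
    finally:
        server.shutdown()
        server.server_close()


status, final_url, body = demo()
```

Because the redirect is followed, the status seen by the caller is that of the final page rather than 302, and the body is available for the checks that previously failed.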