I am trying to fetch a web page hosted on the Tor network. I am using the following code:
import requests

def get_tor_session():
    session = requests.session()
    session.proxies = {'http': 'socks5://127.0.0.1:9150',
                       'https': 'socks5://127.0.0.1:9150'}
    return session

session = get_tor_session()
It works fine when I try it on a regular website, for example:
print(session.get("http://httpbin.org/ip").text)
prints:
{"origin": "80.67.172.162"}
But when I try it on a .onion site, it fails with this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/socks.py", line 813, in connect
negotiate(self, dest_addr, dest_port)
File "/usr/local/lib/python3.6/site-packages/socks.py", line 477, in _negotiate_SOCKS5
CONNECT, dest_addr)
File "/usr/local/lib/python3.6/site-packages/socks.py", line 540, in _SOCKS5_request
resolved = self._write_SOCKS5_address(dst, writer)
File "/usr/local/lib/python3.6/site-packages/socks.py", line 592, in _write_SOCKS5_address
addresses = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM, socket.IPPROTO_TCP, socket.AI_ADDRCONFIG)
File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred:
...
Traceback (most recent call last):
File "spider.py", line 13, in <module>
print(session.get("http://zqktlwi4fecvo6ri.onion/").text)
File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: SOCKSHTTPConnectionPool(host='zqktlwi4fecvo6ri.onion', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSConnection object at 0x106fd62e8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
Answer 0 (score: 2)
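With the socks5 scheme, the domain is resolved locally, through the client's own DNS server. A "normal" DNS server cannot resolve .onion domains, so your request fails. This is the socket.getaddrinfo call that raises socket.gaierror in your traceback.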
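Using the scheme socks5 causes the DNS resolution to happen on the client, rather than on the proxy server. This is in line with curl, which uses the scheme to decide whether to do the DNS resolution on the client or on the proxy. If you want to resolve the domains on the proxy server, use socks5h as the scheme.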
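To make the scheme rule concrete, here is a minimal sketch that classifies a proxy URL by where the hostname would be resolved. The function name resolves_on_proxy and the two constant sets are hypothetical, made up for this illustration; they are not part of requests or curl. The sketch only inspects the URL and does not open any connection.

```python
from urllib.parse import urlsplit

# Schemes that ask the SOCKS proxy to resolve the hostname (remote DNS).
REMOTE_DNS_SCHEMES = {"socks5h", "socks4a"}
# Schemes for which the client resolves the hostname itself before connecting.
LOCAL_DNS_SCHEMES = {"socks5", "socks4"}


def resolves_on_proxy(proxy_url):
    """Return True if the proxy resolves hostnames, False if the client does.

    Hypothetical helper for illustration only.
    """
    scheme = urlsplit(proxy_url).scheme.lower()
    if scheme in REMOTE_DNS_SCHEMES:
        return True
    if scheme in LOCAL_DNS_SCHEMES:
        return False
    raise ValueError("not a SOCKS proxy URL: %r" % proxy_url)


print(resolves_on_proxy("socks5://127.0.0.1:9150"))   # False
print(resolves_on_proxy("socks5h://127.0.0.1:9150"))  # True
```

A check like this could be run against the proxies dictionary before requesting a .onion address, so that a socks5 URL is caught early instead of surfacing as a socket.gaierror.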
So, in order to connect to a .onion site, you should let Tor resolve the domain. You can do this by using the socks5h scheme in the proxies dictionary:
import requests
session = requests.session()
session.proxies = {'http': 'socks5h://127.0.0.1:9150', 'https': 'socks5h://127.0.0.1:9150'}
response = session.get("https://3g2upl4pq6kufc4m.onion/")
print(response)
#<Response [200]>
Note that you may have to install an extra dependency for SOCKS support:
pip install requests[socks]
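Putting this together with the function from the question, the following is one possible sketch of a corrected get_tor_session. The tor_proxies helper and its port and host parameters are assumptions added for illustration. Port 9150 is the one used in the question (the Tor Browser SOCKS port); a standalone tor daemon listens on 9050 by default, so adjust it to your setup.

```python
def tor_proxies(port=9150, host="127.0.0.1"):
    """Build a requests-style proxies dict that lets Tor resolve hostnames.

    Hypothetical helper: the name and the defaults are assumptions.
    """
    proxy_url = "socks5h://%s:%d" % (host, port)
    return {"http": proxy_url, "https": proxy_url}


def get_tor_session(port=9150):
    """Return a requests session routed through a local Tor SOCKS proxy."""
    # Imported here so tor_proxies() stays usable without requests installed.
    import requests

    session = requests.Session()
    session.proxies = tor_proxies(port)
    return session


print(tor_proxies())
# {'http': 'socks5h://127.0.0.1:9150', 'https': 'socks5h://127.0.0.1:9150'}
```

With Tor running locally, session = get_tor_session() followed by session.get(...) on a .onion URL should then go through the proxy with remote DNS resolution, as in the snippet above.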