I am running a spider written with Tornado, similar to https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py, with httpclient.AsyncHTTPClient changed to curl_httpclient.CurlAsyncHTTPClient, of course.
method: HEAD
The spider runs on Windows 10 (64-bit) with Python 3.
Unfortunately, an error occurred:
httpclient.AsyncHTTPClient.configure('tornado.curl_httpclient.CurlAsyncHTTPClient')
Has anyone seen this before? I searched Google, but few people write spiders with Tornado, and I did not find an answer.
Or can anyone tell me what the error means?
Answer 0 (score: 1)
Try overriding the _curl_create method of curl_httpclient.CurlAsyncHTTPClient:
import logging

import certifi
import pycurl
from tornado import curl_httpclient

curl_log = logging.getLogger('tornado.curl_httpclient')


class PersonAsyncHTTPClient(curl_httpclient.CurlAsyncHTTPClient):
    def _curl_create(self):
        curl = pycurl.Curl()
        # The original Tornado source does not have this line; without it an SSL error occurs.
        curl.setopt(pycurl.CAINFO, certifi.where())
        if curl_log.isEnabledFor(logging.DEBUG):
            curl.setopt(pycurl.VERBOSE, 1)
            curl.setopt(pycurl.DEBUGFUNCTION, self._curl_debug)
        # PROTOCOLS first appeared in pycurl 7.19.5 (2014-07-12)
        if hasattr(pycurl, 'PROTOCOLS'):
            curl.setopt(pycurl.PROTOCOLS, pycurl.PROTO_HTTP | pycurl.PROTO_HTTPS)
            curl.setopt(pycurl.REDIR_PROTOCOLS, pycurl.PROTO_HTTP | pycurl.PROTO_HTTPS)
        return curl
Then use it like this:
http_client = PersonAsyncHTTPClient()
req = httpclient.HTTPRequest(url='https://www.google.com.hk/', proxy_host='', proxy_port=1234)
With that, everything works.
That's all.
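As a possible alternative to overriding a private method, Tornado's HTTPRequest accepts a ca_certs argument, which the curl client hands to libcurl as the CA bundle. The sketch below is only an illustration under that assumption: build_head_request is a hypothetical helper name, not part of Tornado, and it reuses the HEAD method mentioned in the question.

```python
from tornado import httpclient


def build_head_request(url, ca_bundle):
    """Return a HEAD request that verifies TLS against the given CA bundle.

    build_head_request is a hypothetical helper, not part of Tornado.
    ca_bundle is a filesystem path to a PEM file, for example certifi.where().
    """
    return httpclient.HTTPRequest(
        url=url,
        method="HEAD",
        ca_certs=ca_bundle,  # the curl client forwards this to pycurl.CAINFO
        validate_cert=True,  # keep certificate verification switched on
    )
```

A request built this way, for example build_head_request('https://www.google.com.hk/', certifi.where()), could then be passed to the fetch method of the configured client. Whether this removes the error on a given Windows setup is untested here, so the subclass above remains the answer's confirmed fix.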