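When I use the following function with the Python 3.2.3 package in cygwin, it hangs on any request to any https host. After 60 seconds it throws this error: [Errno 104] Connection reset by peer.
UPDATE: I thought this was limited to cygwin, but it also happens on Windows 7 64-bit with Python 3.3. I'll try 3.2 right now. The error when using the Windows command shell is: urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host
UPDATE2 (Electric-Bugaloo): This is limited to a couple of sites that I'm trying to use. I tested against Google and other major sites with no issues. It appears to be related to this bug: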
http://bugs.python.org/issue16361
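Specifically, the server hangs after the client hello. This is caused by the version of openssl that ships with the compiled builds of Python 3.2 and 3.3: it misidentifies the server's SSL version. What I need now is code that automatically downgrades my SSL version to sslv3 when opening a connection to the affected sites, as in this post: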
How to use urllib2 to get a webpage using SSLv3 encryption
But I can't get it to work.
import logging
import socket
import sys
import time
import urllib.request
from datetime import datetime
from gzip import GzipFile
from io import BytesIO
from urllib.error import URLError

logger = logging.getLogger(__name__)

# Content types treated as UTF-8 text even without an explicit charset
TEXT_TYPES = ('text/html', 'text/plain')


def worker(url, body=None, bt=None):
    '''This function does all the requests to wherever for data
    takes in a url, optional body utf-8 encoded please, and optional body type'''
    hdrs = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-us,en;q=0.5',
            'Accept-Encoding': 'gzip,deflate',
            'User-Agent': "My kewl Python tewl!"}
    if 'myweirdurl' in url:
        # Same headers, different User-Agent for this one site
        hdrs['User-Agent'] = "Netscape 6.0"
    if bt:
        hdrs['Content-Type'] = bt
    urlopen = urllib.request.urlopen
    Request = urllib.request.Request
    start_req = time.time()
    logger.debug('request start: {}'.format(datetime.now().ctime()))
    if 'password' not in url:
        logger.debug('request url: {}'.format(url))
    req = Request(url, data=body, headers=hdrs)
    try:
        if body:
            logger.debug("body: {}".format(body))
        # The body is already attached to req, so one call covers both cases
        handle = urlopen(req, timeout=298)
    except URLError as ue:
        # Must come before socket.error: since Python 3.3 socket.error is an
        # alias of OSError, and URLError is a subclass of OSError.
        end_time = time.time()
        logger.error(ue)
        logger.error(hasattr(ue, 'code'))
        logger.error(hasattr(ue, 'errno'))
        logger.error(hasattr(ue, 'reason'))
        if hasattr(ue, 'code'):
            logger.warning("The server couldn't fulfill the request.")
            logger.error('Error code: {}'.format(ue.code))
            if ue.code == 404:
                return "Resource Not Found (404)"
        elif hasattr(ue, 'reason'):
            logger.warning('We failed to reach a server with {}'.format(url))
            logger.error('Reason: {}'.format(ue.reason))
            logger.error(type(ue.reason))
            # reason may be a plain string or an exception carrying an errno
            reason_errno = getattr(ue.reason, 'errno', None)
            logger.error(reason_errno)
            if 'timed out' in str(ue.reason):
                logger.error("Arrggghh, timed out!")
            else:
                logger.error("Why U no match my reason?")
            if reason_errno == 60:  # 60 is ETIMEDOUT on some platforms only
                return "Operation timed out"
        elif hasattr(ue, 'errno'):
            logger.warning(ue.reason)
            logger.error('Error code: {}'.format(ue.errno))
            if ue.errno == 60:
                return "Operation timed out"
        logger.error("req time: {}".format(end_time - start_req))
        logger.error("returning: Server Error")
        return "Server Error"
    except socket.error as se:
        logger.error(se)
        logger.error(se.errno)
        logger.error(type(se))
        # The original tested hasattr(se, 'errno') == 60, which is never true
        if se.errno == 60:
            logger.error("returning: Request Timed Out")
            return 'Request Timed Out'
        # Return a string here too instead of falling through to None
        return "Server Error"
    else:
        resp_headers = dict(handle.info())
        logger.debug('Here are the headers of the page : {}'.format(resp_headers))
        logger.debug("The true URL in case of redirects {}".format(handle.geturl()))
        ce = resp_headers.get('Content-Encoding')
        if ce is not None:
            logger.debug('Content-Encoding: {}'.format(ce))
        ct = resp_headers.get('Content-Type')
        if ct is not None:
            logger.debug('Content-Type: {}'.format(ct))
        raw = handle.read()
        if ce == "gzip":
            logger.debug("Unzipping payload")
            raw = GzipFile(fileobj=BytesIO(raw), mode="rb").read()
        # Guard against a missing Content-Type before calling .lower()
        if ct is not None and ("charset=utf-8" in ct.lower() or ct in TEXT_TYPES):
            return raw.decode("utf-8")
        logger.debug("Unknown content type: {}".format(ct))
        sys.exit()
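The two errors quoted above, [Errno 104] and [WinError 10054], are the Linux and Windows codes for the same condition, a connection reset, while 60 is a timeout code on some platforms only (macOS and the BSDs). As a minimal sketch, failures could be classified through the symbolic names in the errno module instead of numeric literals. The helper name classify_failure and its three return labels are hypothetical and are not part of the original function.

```python
import errno
import socket
from urllib.error import URLError

# Portable constants, plus the Winsock values where the errno module exposes
# them (getattr keeps this importable on non-Windows builds).
RESET_CODES = {errno.ECONNRESET, getattr(errno, "WSAECONNRESET", errno.ECONNRESET)}
TIMEOUT_CODES = {errno.ETIMEDOUT, getattr(errno, "WSAETIMEDOUT", errno.ETIMEDOUT)}


def classify_failure(exc):
    """Return 'reset', 'timeout' or 'other' for a URLError or OSError."""
    # URLError wraps the underlying error in .reason, which may also be a str
    cause = exc.reason if isinstance(exc, URLError) else exc
    if isinstance(cause, socket.timeout):
        return "timeout"
    code = getattr(cause, "errno", None)
    if code in RESET_CODES:
        return "reset"
    if code in TIMEOUT_CODES:
        return "timeout"
    return "other"
```

Inside the except blocks this would replace the comparisons against 60, for example: if classify_failure(ue) == "timeout": return "Operation timed out".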
Answer 0 (score: 4)
I figured it out. Here is the block of code needed to make this work on Windows:
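# had to add this windows specific block to handle this bug in urllib2:
# http://bugs.python.org/issue11220
# (requires "from platform import platform" and "import urllib.request")
if "windows" in platform().lower():
    # `'a' or 'b' in s` is always true, so test each marker explicitly
    if any(marker in url.lower()
           for marker in ('my_wacky_url', 'my_other_wacky_url')):
        import ssl
        # Pin the handshake to TLSv1 for the affected hosts
        https_handler = urllib.request.HTTPSHandler(
            context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
        opener = urllib.request.build_opener(https_handler)
        # install_opener is global: every later urlopen() call uses this opener
        urllib.request.install_opener(opener)
# end of urllib workaround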
I added this block right before the first try: block and it works like a charm. Thanks for your help, andrean!
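One thing to be aware of is that install_opener changes the default opener for the whole process, so every later urlopen call goes through the pinned protocol, not just calls to the affected sites. A minimal sketch of a per-URL alternative follows. The host names, needs_pinned_protocol and opener_for are all hypothetical placeholders, and on current Python versions ssl.PROTOCOL_TLSv1 is deprecated, so treat this as an illustration for the 3.2/3.3 setup discussed here rather than a recommendation.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit

# Placeholder host names; substitute the sites that actually fail.
AFFECTED_HOSTS = frozenset(["legacy.example.com", "other-legacy.example.com"])


def needs_pinned_protocol(url, affected_hosts=AFFECTED_HOSTS):
    """True when the URL's host is one of the known-problematic sites."""
    host = urlsplit(url).hostname or ""
    return host.lower() in affected_hosts


def opener_for(url, protocol=ssl.PROTOCOL_TLSv1, affected_hosts=AFFECTED_HOSTS):
    """Build an opener for this URL without touching the global default."""
    if needs_pinned_protocol(url, affected_hosts):
        # With the default protocol argument, a bare SSLContext performs no
        # certificate verification, same as the block above.
        handler = urllib.request.HTTPSHandler(context=ssl.SSLContext(protocol))
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()
```

In the worker function the call would then become handle = opener_for(url).open(req, timeout=298), leaving plain urlopen untouched for everything else.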