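Python requests Session cannot read a response after reading a large (over 50 MB) response body

Date: 2014-08-12 03:56:19

Tags: python session

I am using the Session object from Python requests to access some REST APIs. I have run into a problem: when the first request reads a large body (over 50 MB), subsequent HTTP requests on the same Session object fail. If I do not use a Session object, everything works fine. The code below shows the problem.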

import requests       # version 2.3.0  # python version 2.7

headers = {"Authorization":"Bearer sometoken"}

sess = requests.Session()
sess.verify = False
host = "https://somehost/endpoint/"
res = sess.get(url = host+'obj1/28/content', headers = headers)
print res  # this result received successfully with 200 response status code

url = host + 'obj2/1/content'
res = sess.get(url = url, headers=headers)  # the process hangs here indefinitely; I have to kill it to exit
print "content ", res.content # this line never gets executed...

Stack trace after killing the process:

  File "/opt/lib/python2.7/site-packages/requests/sessions.py", line 556, in send
    r = adapter.send(request, **kwargs)
  File "/opt/lib/python2.7/site-packages/requests/adapters.py", line 391, in send
    r.content
  File "/opt/lib/python2.7/site-packages/requests/models.py", line 690, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/opt/lib/python2.7/site-packages/requests/models.py", line 628, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/opt/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 240, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/opt/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 187, in read
    data = self._fp.read(amt)
  File "/opt/lib/python2.7/httplib.py", line 567, in read
    s = self.fp.read(amt)
  File "/opt/lib/python2.7/httplib.py", line 1313, in read
    return s + self._file.read(amt - len(s))
  File "/opt/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/opt/lib/python2.7/ssl.py", line 242, in recv
    return self.read(buflen)
  File "/opt/lib/python2.7/ssl.py", line 161, in read
    return self._sslobj.read(len)

The same HTTP requests work fine without a Session object, though.

print requests.get( host+'obj1/28/content', headers = headers, verify = False)
print requests.get( host+'obj2/1/content', headers = headers, verify = False)

1 answer:

Answer 0 (score: 2)

From the requests documentation:

  

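  Excellent news - thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

  Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object.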

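It sounds like the large request is holding on to that connection or, as abarnert suggested, there is a problem on the server. Try setting stream=False, or access the content of that first res object so that requests knows it can release the connection.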
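As a rough sketch of the "read all the body data" advice, the helper below consumes a streamed response completely and then closes it, so the connection can go back to the session's pool before the next request. The names fetch_in_chunks, chunk_size and timeout are my own choices for illustration and are not from the question. The timeout is an extra assumption, added so that a stalled read raises an exception instead of hanging forever.

```python
import contextlib

import requests


def fetch_in_chunks(session, url, chunk_size=64 * 1024, timeout=30):
    """Read a response body piece by piece and return the number of bytes read.

    Hypothetical helper for illustration only.
    """
    total = 0
    # stream=True defers downloading the body until we iterate over it.
    # contextlib.closing() calls response.close() when the block exits,
    # which releases the underlying connection.
    with contextlib.closing(
        session.get(url, stream=True, timeout=timeout)
    ) as response:
        response.raise_for_status()
        for chunk in response.iter_content(chunk_size=chunk_size):
            total += len(chunk)  # a real caller would write the chunk to a file
    return total
```

With the session from the question, this would be called as fetch_in_chunks(sess, host + 'obj1/28/content') before requesting obj2. Whether that avoids the hang depends on what the server is doing, so treat it as something to try rather than a confirmed fix.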

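Edit: this looks like the problem. When you call requests.get, you explicitly pass verify=False. That is what makes those calls skip certificate verification, because verification is on by default (verify=True).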

Your hang, however, is in adapter.send(request, **kwargs), so it looks like something is going wrong in the HTTPAdapter object. adapter.send has the following signature:

 send(request, stream=False, timeout=None, verify=True, cert=None, proxies=None)

The default there is verify=True.

This sounds like a bug in requests, but my guess is that the verify parameter is not being passed down from the Session. The signature of sess.request is:

request(method, url, params=None, data=None, headers=None, cookies=None, files=None, auth=None, timeout=None, allow_redirects=True, proxies=None, hooks=None, stream=None, verify=None, cert=None)

Here verify=None rather than False, so perhaps it gets overridden somewhere along the way.

Try setting verify=False explicitly in the sess.get call.
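One way to check this guess without touching the network is to mount an adapter that only records the verify value the Session hands to send(). This is a minimal sketch: RecordingAdapter is a hypothetical helper written for this answer, not part of requests, and the canned 200 response is made up. Setting trust_env to False is a further assumption that keeps proxy and CA-bundle environment variables out of the picture.

```python
import io

import requests
from requests.adapters import HTTPAdapter


class RecordingAdapter(HTTPAdapter):
    """Hypothetical adapter: records send() arguments instead of doing I/O."""

    def __init__(self):
        super(RecordingAdapter, self).__init__()
        self.seen_verify = []

    def send(self, request, **kwargs):
        # Remember the verify value that the Session passed down.
        self.seen_verify.append(kwargs.get("verify"))
        # Build a canned response so Session.send() can finish normally.
        response = requests.Response()
        response.status_code = 200
        response.url = request.url
        response.request = request
        response.raw = io.BytesIO(b"ok")
        return response


sess = requests.Session()
sess.trust_env = False  # ignore proxy / CA bundle environment variables
sess.verify = False     # session-level setting, as in the question
adapter = RecordingAdapter()
sess.mount("https://", adapter)

url = "https://somehost/endpoint/obj2/1/content"
sess.get(url)                # relies on the session-level setting
sess.get(url, verify=False)  # explicit per-request setting

print(adapter.seen_verify)
```

If the session-level setting is passed down correctly, both recorded values are False. If the first entry is anything else on your requests version, that would support the theory that sess.verify is being lost on the way to the adapter.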