Multiple POST/GET requests in a single session with Python requests

Time: 2015-05-04 15:07:43

Tags: python web-crawler python-requests

I am trying to write a crawler with the Python requests module to download some files automatically. However, I have run into a problem.

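I initialize a new requests session and log in to the website with the post method. After that, any further post/get call on the session fails (simplified code below):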

import requests

s = requests.session()
s.post(url, data=post_data, headers=headers)
# Up to here everything works; the next step reports an error.
# s.get(url), s.post(url), or even repeating
# s.post(url, data=post_data, headers=headers) all raise the error below.
s.get(url)

It reports the following error:

Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 372, in _make_request
httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
body=body, headers=headers)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 374, in _make_request
httplib_response = conn.getresponse()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1162, in getresponse
raise ResponseNotReady(self.__state)
http.client.ResponseNotReady: Request-sent

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/adapters.py", line 370, in send
timeout=timeout
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 597, in urlopen
_stacktrace=sys.exc_info()[2])
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/util/retry.py", line 245, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/packages/six.py", line 309, in reraise
raise value.with_traceback(tb)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
body=body, headers=headers)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 374, in _make_request
httplib_response = conn.getresponse()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1162, in getresponse
raise ResponseNotReady(self.__state)
requests.packages.urllib3.exceptions.ProtocolError: ('Connection aborted.', ResponseNotReady('Request-sent',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 280, in <module>
test()
File "test.py", line 273, in test
emuch1.getEbook()
File "test.py", line 146, in getEbook
self.downloadEbook(ebook)
File "test.py", line 179, in downloadEbook
file_url=self.downloadEbookGetFileUrl(ebook).decode('gbk')
File "test.py", line 211, in downloadEbookGetFileUrl
download_url=self.downloadEbookGetUrl(ebook)
File "test.py", line 200, in downloadEbookGetUrl
respond_ebook=self.session.get(ebook_url)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 477, in get
return self.request('GET', url, **kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/requests/adapters.py", line 415, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ResponseNotReady('Request-sent',))

I have no idea why this happens. Can anyone help me?

2 answers:

Answer 0: (score: 3)

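requests uses an internal (bundled) copy of urllib3. My impression is that there is a version mismatch between that internal urllib3 and requests itself.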

  

httplib_response = conn.getresponse(buffering=True) TypeError: getresponse() got an unexpected keyword argument 'buffering'

This seems to indicate that requests is calling into urllib3 (the internal version, not Python's) but wants to specify 'buffering', which does not exist.
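For context, here is a minimal sketch of how a chained traceback like the one in the question can arise. The traceback shows `getresponse(buffering=True)` at line 372 and a plain `getresponse()` at line 374 of connectionpool.py, which is consistent with a try-then-fall-back pattern. `FakeConnection` and `get_response` are hypothetical names used only for illustration; this is not urllib3's actual code. If that reading is right, the first TypeError may be a handled fallback rather than the root failure.

```python
class FakeConnection:
    """Hypothetical stand-in for an HTTP connection (not a real library class)."""

    def getresponse(self):
        # This method accepts no 'buffering' keyword, so passing one raises TypeError.
        # Here we simulate a connection that is not ready to produce a response.
        raise RuntimeError("Request-sent")


def get_response(conn):
    try:
        # First attempt: pass a keyword that only some implementations accept.
        return conn.getresponse(buffering=True)
    except TypeError:
        # Fallback: plain call. If this raises too, Python chains it to the
        # TypeError and prints "During handling of the above exception,
        # another exception occurred" between the two tracebacks.
        return conn.getresponse()


try:
    get_response(FakeConnection())
except RuntimeError as exc:
    caught = exc

# The handled TypeError is kept as the context of the later exception.
print(type(caught.__context__).__name__)
print(caught)
```

The point of the sketch is only that both exceptions end up in the printed traceback even though the first one was caught.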

Other questions describe experiences similar to mine.

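The latest requests releases (2.6.*) have had some other issues that are being addressed right now. I suspect you are using one of those versions. Try rolling back to an earlier version, 2.4.1 or even 2.2.1. You can keep the latest version installed if you specify the version to use at the top of your program: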

__requires__ = ["requests==2.2.1"]
import pkg_resources

(at least before importing requests yourself)

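Solution: last week I exchanged a few emails with the development team, and it looks like they have already produced a fix in 2.7 very quickly! (In fact, I saw it was uploaded only yesterday.) So if you run into a similar problem, download the latest version!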
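To decide whether an upgrade is worth trying, a quick check of the installed version can help. The sketch below assumes, as this answer suspects, that the 2.6.* series is the affected one. The helper names `parse_version` and `looks_affected` are hypothetical and not part of requests.

```python
import re


def parse_version(version):
    """Turn a version string such as '2.6.0' into a tuple of ints, e.g. (2, 6, 0)."""
    return tuple(int(number) for number in re.findall(r"\d+", version)[:3])


def looks_affected(version):
    """Return True if the version is in the 2.6.* series (an assumption from this answer)."""
    return parse_version(version)[:2] == (2, 6)


if __name__ == "__main__":
    try:
        import requests
    except ImportError:
        print("requests is not installed")
    else:
        # requests.__version__ holds the installed version string.
        print(requests.__version__, "affected:", looks_affected(requests.__version__))
```

If the check reports an affected version, upgrading to 2.7 or later (or pinning an older release as shown above) would be the next thing to try.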

Answer 1: (score: 0)

The problem was solved by upgrading requests to the latest version. It was probably a bug in 2.6 (possibly that version; I'm not quite sure).