Python http.client.IncompleteRead(0 bytes read) error

Asked: 2017-01-08 03:00:45

Tags: python-3.x web-scraping beautifulsoup urllib

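I have seen this error on forums and read the replies, but I still don't understand what it is or how to fix it. I am scraping data from 16k links; my script extracts similar information from each link and writes it to a .csv. Some of the data gets written before this error occurs.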

Traceback (most recent call last):
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 541, in _get_chunk_left
   chunk_left = self._read_next_chunk_size()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 508, in _read_next_chunk_size
   return int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 558, in _readall_chunked
   chunk_left = self._get_chunk_left()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 543, in _get_chunk_left
   raise IncompleteRead(b'')
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "MoviesToDb.py", line 91, in <module>
   html = r.read()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 455, in read
   return self._readall_chunked()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 565, in _readall_chunked
   raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(17891 bytes read)

I would like to know:
1) What does this error mean?
2) How do I prevent it?

2 answers:

Answer 0 (score: 3)

Try importing:

from http.client import IncompleteRead

and wrap the read inside your loop over the links:

try:
    html = r.read()
except IncompleteRead:
    # Oh well, reconnect and keep trucking
    continue
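A slightly fuller sketch of the same idea, in case skipping the link outright loses too much data: retry the request a few times before giving up. The function name `fetch_with_retries` and the parameters `attempts`, `delay`, and `opener` are made up for this illustration and are not part of the original script; `opener` is only there so the function can be exercised without a network.

```python
import time
import urllib.request
from http.client import IncompleteRead


def fetch_with_retries(url, attempts=3, delay=1.0, opener=urllib.request.urlopen):
    """Return the response body as bytes, retrying when the read is cut short.

    All parameters besides `url` are hypothetical knobs for this sketch.
    `attempts` is expected to be at least 1.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # Open a fresh connection on every attempt.
            with opener(url) as response:
                return response.read()
        except IncompleteRead as error:
            # The connection closed before the full body arrived;
            # error.partial holds whatever bytes were received.
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed: surface the last IncompleteRead to the caller.
    raise last_error
```

In the scraping loop you would call `html = fetch_with_retries(link)` and still keep an `except IncompleteRead: continue` around it, so a link that fails on every attempt is skipped instead of stopping the whole run.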

Answer 1 (score: -1)


requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

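This happens because the server speaks HTTP/1.0 while Python's client uses HTTP/1.1. The fix is to set the client's protocol version explicitly (HTTP/1.0 has no chunked transfer encoding, so the chunked read that fails in the traceback is avoided), like this: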

For Python 3, add:

import http.client
http.client.HTTPConnection._http_vsn = 10
http.client.HTTPConnection._http_vsn_str = 'HTTP/1.0'

For Python 2, add:

import httplib
httplib.HTTPConnection._http_vsn = 10
httplib.HTTPConnection._http_vsn_str = 'HTTP/1.0'

See: How to deal with "http.client.IncompleteRead: IncompleteRead(0 bytes read)" problem
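Note that the Python 3 assignments above change class attributes, so they apply to every HTTP connection in the process. If only some requests need the downgrade, one option is to scope it, as in this rough sketch. The helper name `force_http10` is made up for illustration, and `_http_vsn` / `_http_vsn_str` are private attributes of `http.client.HTTPConnection` rather than a documented API, so this may stop working in a future Python version. It is also not thread-safe.

```python
import contextlib
import http.client


@contextlib.contextmanager
def force_http10():
    """Temporarily make http.client send HTTP/1.0 requests, then restore the defaults.

    Hypothetical helper: it relies on private class attributes.
    """
    conn_cls = http.client.HTTPConnection
    saved = (conn_cls._http_vsn, conn_cls._http_vsn_str)
    conn_cls._http_vsn = 10
    conn_cls._http_vsn_str = "HTTP/1.0"
    try:
        yield
    finally:
        # Put the original protocol version back, even if the request failed.
        conn_cls._http_vsn, conn_cls._http_vsn_str = saved
```

Usage would be `with force_http10(): html = urllib.request.urlopen(link).read()`, leaving connections opened outside the `with` block on the default protocol version.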