I am using the function below to download files from an HTTP server. The code works, but it needs to be optimized with respect to the downloaded chunks.
import requests

def download_bits(bits):
    # Download each file in the list of links, streaming in 1 MiB chunks
    for link in bits:
        filename = link.split('/')[-1]
        print "Downloading File : {0}".format(filename)
        resp = requests.get(link, stream=True)
        with open(filename, 'wb') as f:
            for chunk in resp.iter_content(chunk_size=1024*1024):
                if chunk:  # skip keep-alive chunks
                    f.write(chunk)
        print "File {0} Downloaded.! ".format(filename)
    print "All Downloads are Completed.!"
I ran this code to verify the functionality, and the remote host closed the connection. How can I avoid this?
Downloading file:CentOS-7-x86_64-DVD-1708.iso
CentOS-7-x86_64-DVD-1708.iso downloaded!
Downloading file:CentOS-7-x86_64-Everything-1708.iso
---------------------------------------------------------------------------
ChunkedEncodingError Traceback (most recent call last)
<ipython-input-43-381f54eff57b> in <module>()
15 # download started
16 with open(file_name, 'wb') as f:
---> 17 for chunk in r.iter_content(chunk_size = 1024*1024):
18 if chunk:
19 f.write(chunk)
C:\ProgramData\Anaconda2\lib\site-packages\requests\models.pyc in generate()
739 yield chunk
740 except ProtocolError as e:
--> 741 raise ChunkedEncodingError(e)
742 except DecodeError as e:
743 raise ContentDecodingError(e)
ChunkedEncodingError: ("Connection broken: error(10054, 'An existing connection was forcibly closed by the remote host')", error(10054, 'An existing connection was forcibly closed by the remote host'))
Answer 0 (score: 1)
If you use stream=True, requests will not download the response body on its own; it keeps the connection open until you read the entire response with .iter_content() or .iter_lines(). The server may choose to close the connection if it stays open too long without activity.

It is hard to say exactly why this happens here, but I would guess your chunk size is too large for the underlying network. Try chunk_size=None to let requests give you the chunks exactly as they are received from the network interface:
with open(filename, 'wb') as foutput:
    for chunk in resp.iter_content(chunk_size=None):
        if chunk:
            foutput.write(chunk)
Don't worry about optimizing the chunk size: the OS I/O layer has its own buffering for disk writes.
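
Putting it together, here is a minimal sketch of the download_bits function from the question with only the chunk size changed to None; the streaming request and the per-file progress messages are kept as in the original:

import requests

def download_bits(bits):
    for link in bits:
        filename = link.split('/')[-1]
        print "Downloading File : {0}".format(filename)
        resp = requests.get(link, stream=True)
        with open(filename, 'wb') as foutput:
            # chunk_size=None yields data as it arrives from the network
            # interface instead of forcing 1 MiB reads
            for chunk in resp.iter_content(chunk_size=None):
                if chunk:
                    foutput.write(chunk)
        print "File {0} Downloaded.! ".format(filename)
    print "All Downloads are Completed.!"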