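I need to download a fairly large (~200MB) file. I worked out how to download and save the file using the approach linked here. It would be nice to have a progress bar showing how much has been downloaded. I found ProgressBar, but I am not sure how to combine the two.
Here is the code I tried, but it did not work.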
bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)
Answer 0 (score: 47)
I suggest trying tqdm [1]; it is very easy to use.
Example code for downloading with the requests library [2]:
from tqdm import tqdm
import requests
import math

url = "http://example.com/bigfile.bin"
# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)
# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024
wrote = 0
with open('output.bin', 'wb') as f:
    # total is a number of blocks; true division plus ceil counts a final partial block.
    for data in tqdm(r.iter_content(block_size), total=math.ceil(total_size / block_size), unit='KB', unit_scale=True):
        wrote = wrote + len(data)
        f.write(data)
if total_size != 0 and wrote != total_size:
    print("ERROR, something went wrong")
[1]:https://github.com/tqdm/tqdm
[2]:http://docs.python-requests.org/en/master/
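The `total` passed to tqdm in this answer is a number of blocks rather than a number of bytes, so a final partial block has to be counted as well. A minimal sketch of that count, assuming a hypothetical helper named `num_blocks` (it is not part of tqdm or requests):

```python
import math


def num_blocks(total_size: int, block_size: int) -> int:
    """Return how many blocks of block_size cover total_size bytes.

    Hypothetical helper for illustration only; a trailing partial
    block counts as one more iteration of the download loop.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # True division first, then round up; floor division (//) would
    # silently drop the final partial block.
    return math.ceil(total_size / block_size)


print(num_blocks(2049, 1024))
```

With a 2049-byte body and 1024-byte blocks the loop runs three times, which is why rounding up matters for the bar to reach 100%.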
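Answer 1 (score: 4)
There seems to be a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.
In the following example, note the use of maxval rather than max_value. Also note the call to .start() to initialize the bar. This has been noted in an Issue.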
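import progressbar
import requests

url = "http://stackoverflow.com/"

def download_file(url):
    local_filename = 'test.html'
    r = requests.get(url, stream=True)
    # Raises KeyError if the server does not send a Content-Length header.
    file_size = int(r.headers['Content-Length'])
    chunk_size = 1  # iter_content() yields one byte at a time by default
    num_bars = file_size // chunk_size
    bar = progressbar.ProgressBar(maxval=num_bars).start()
    # The with block closes the file even if the download fails midway.
    with open(local_filename, 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size), start=1):
            f.write(chunk)
            # Clamp: decoded content can be longer than Content-Length.
            bar.update(min(i, num_bars))
    bar.finish()

download_file(url)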
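Answer 2 (score: 2)
You appear to need the remote file size (answered here) to work out how far along you are.
You can then update the progress bar as each chunk is processed: if you know the total size and the size of each chunk, you can determine when to update the bar.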
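The idea in this answer can be sketched without any progress-bar library. The example below assumes a hypothetical generator named `progress_updates` that receives the size of each chunk as it arrives plus the total size, and reports a new percentage only when the whole-number value changes:

```python
from typing import Iterable, Iterator


def progress_updates(chunk_sizes: Iterable[int], total_size: int) -> Iterator[int]:
    """Yield the whole-number percentage each time it increases.

    chunk_sizes holds the length of each chunk as it arrives;
    total_size is the remote file size in bytes (for example, taken
    from a Content-Length header). Hypothetical helper for illustration.
    """
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    done = 0
    last_percent = 0
    for size in chunk_sizes:
        done += size
        # Clamp at 100 in case more bytes arrive than were announced.
        percent = min(100, done * 100 // total_size)
        if percent > last_percent:
            last_percent = percent
            yield percent


for percent in progress_updates([256, 256, 512], total_size=1024):
    print(f"{percent}% done")
```

In a real download loop the chunk sizes would come from `len(chunk)` for each chunk read; reporting only on a change keeps the bar from being redrawn for every small chunk.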
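Answer 3 (score: 2)
You can also use the Python library enlighten. It is powerful, provides colorful progress bars, and works correctly on both Linux and Windows.
The code is below; it can also be run here on repl.it.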
import math
import requests, enlighten

url = 'https://upload.wikimedia.org/wikipedia/commons/a/ae/Arthur_Streeton_-_Fire%27s_on_-_Google_Art_Project.jpg?download'
fname = 'image.jpg'
# Should be one global variable
MANAGER = enlighten.get_manager()
r = requests.get(url, stream = True)
assert r.status_code == 200, r.status_code
dlen = int(r.headers.get('Content-Length', '0')) or None
with MANAGER.counter(color = 'green', total = dlen and math.ceil(dlen / 2 ** 20), unit = 'MiB', leave = False) as ctr, \
     open(fname, 'wb', buffering = 2 ** 24) as f:
    for chunk in r.iter_content(chunk_size = 2 ** 20):
        print(chunk[-16:].hex().upper())
        f.write(chunk)
        ctr.update()
Answer 4 (score: 1)
Here is an answer that uses tqdm.
import requests
from tqdm import tqdm

def download(url, fname):
    resp = requests.get(url, stream=True)
    total = int(resp.headers.get('content-length', 0))
    with open(fname, 'wb') as file, tqdm(
        desc=fname,
        total=total,
        unit='iB',
        unit_scale=True,
        unit_divisor=1024,
    ) as bar:
        for data in resp.iter_content(chunk_size=1024):
            # file.write() returns the number of bytes written.
            size = file.write(data)
            bar.update(size)
Gist: https://gist.github.com/yanqd0/c13ed29e29432e3cf3e7c38467f42f51
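Answer 5 (score: 1)
The tqdm package now includes a function designed specifically for this kind of situation: wrapattr. You just wrap an object's read (or write) attribute, and tqdm handles the rest. There is no messing with chunk sizes or anything like that. Here is a simple download function that puts it together with requests: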
def download(url, filename):
    import functools
    import pathlib
    import shutil
    import requests
    from tqdm.auto import tqdm

    r = requests.get(url, stream=True, allow_redirects=True)
    if r.status_code != 200:
        r.raise_for_status()  # Raises only for 4xx and 5xx codes, so...
        raise RuntimeError(f"Request to {url} returned status code {r.status_code}")
    file_size = int(r.headers.get('Content-Length', 0))

    path = pathlib.Path(filename).expanduser().resolve()
    path.parent.mkdir(parents=True, exist_ok=True)

    desc = "(Unknown total file size)" if file_size == 0 else ""
    r.raw.read = functools.partial(r.raw.read, decode_content=True)  # Decompress if needed
    with tqdm.wrapattr(r.raw, "read", total=file_size, desc=desc) as r_raw:
        with path.open("wb") as f:
            shutil.copyfileobj(r_raw, f)

    return path
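Answer 6 (score: 1)
Comparing the number of bytes you have already downloaded with the total file size tells you how far along you are. Alternatively, you can use tqdm.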
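For the first option, a minimal sketch of turning the two byte counts into a text bar, assuming a hypothetical function named `format_progress` and only the standard library:

```python
def format_progress(downloaded: int, total: int, width: int = 20) -> str:
    """Return a text bar; with width=10, halfway looks like '[#####-----]  50%'.

    Hypothetical helper for illustration only. If the total size is
    unknown (zero or negative), only the byte count is reported.
    """
    if total <= 0:
        return f"{downloaded} bytes"
    # Clamp so that an over-long body never draws past the end of the bar.
    done = min(downloaded, total)
    filled = done * width // total
    percent = done * 100 // total
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {percent:3d}%"


print(format_progress(50, 100, width=10))
```

Integer arithmetic is used throughout so the filled width and the percentage never disagree because of floating-point rounding.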