Progress bar while downloading a file over HTTP with requests

Time: 2016-06-01 15:53:43

Tags: python python-requests

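I need to download a fairly large (~200MB) file. I figured out how to download and save the file using here. It would be nice to have a progress bar showing how much has been downloaded. I found ProgressBar, but I'm not sure how to combine the two.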

Here is the code I tried, but it did not work.

bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)

7 answers:

Answer 0 (score: 47)

I suggest you try tqdm [1], which is very easy to use. Example code for downloading with the requests library [2]:

from tqdm import tqdm
import requests
import math


url = "http://example.com/bigfile.bin"
# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)

# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024
wrote = 0
with open('output.bin', 'wb') as f:
    # True division, so ceil() rounds a partial last block up to a full iteration.
    for data in tqdm(r.iter_content(block_size), total=math.ceil(total_size / block_size), unit='KB', unit_scale=True):
        wrote = wrote + len(data)
        f.write(data)
if total_size != 0 and wrote != total_size:
    print("ERROR, something went wrong")

[1]:https://github.com/tqdm/tqdm
[2]:http://docs.python-requests.org/en/master/

Answer 1 (score: 4)

There seems to be a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.

In the following example, note the use of maxval instead of max_value. Also note the use of .start() to initialize the bar. This has been noted in an Issue.

import progressbar
import requests

url = "http://stackoverflow.com/"


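def download_file(url):
    local_filename = 'test.html'
    r = requests.get(url, stream=True)
    # Raises KeyError if the server sends no Content-Length header.
    file_size = int(r.headers['Content-Length'])
    bar = progressbar.ProgressBar(maxval=file_size).start()
    downloaded = 0
    with open(local_filename, 'wb') as f:
        # iter_content() defaults to 1-byte chunks; counting bytes works for any chunk size.
        for chunk in r.iter_content():
            f.write(chunk)
            downloaded += len(chunk)
            # Clamp: decoded content can exceed Content-Length for compressed responses.
            bar.update(min(downloaded, file_size))
    bar.finish()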

download_file(url)

Answer 2 (score: 2)

It looks like you need to get the remote file size (answered here) to calculate how far along you are.

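You can then update the progress bar as you process each chunk. If you know the total size and the size of each chunk, you can work out when to update the progress bar.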
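A minimal sketch of that idea, assuming the server reports the size in a Content-Length header. The names track_progress, print_percent, and download are made up for this illustration and are not part of requests:

```python
import sys


def track_progress(chunks, total_size, report):
    """Yield each chunk unchanged, calling report(done, total) after each one."""
    done = 0
    for chunk in chunks:
        done += len(chunk)
        report(done, total_size)
        yield chunk


def print_percent(done, total):
    """Write a one-line readout; total is 0 when the size is unknown."""
    if total:
        sys.stderr.write(f"\r{done * 100 // total:3d}% ({done}/{total} bytes)")
    else:
        sys.stderr.write(f"\r{done} bytes")
    sys.stderr.flush()


def download(url, filename, chunk_size=8192):
    """Hypothetical helper: stream url to filename while reporting progress."""
    import requests  # imported here so the helpers above work without it

    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        # Content-Length is optional; fall back to 0 (unknown) if it is absent.
        total = int(r.headers.get("Content-Length", 0))
        with open(filename, "wb") as f:
            chunks = r.iter_content(chunk_size)
            for chunk in track_progress(chunks, total, print_percent):
                f.write(chunk)
    sys.stderr.write("\n")
```

Because track_progress only sees an iterable of byte chunks, the report callback could just as well update a progressbar or tqdm bar instead of printing.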

Answer 3 (score: 2)

You can also use the Python library enlighten. It is powerful, provides colorful progress bars, and works correctly on both Linux and Windows.

Below is the code, followed by a live screencast. The code can be run here on repl.it.

import math
import requests, enlighten

url = 'https://upload.wikimedia.org/wikipedia/commons/a/ae/Arthur_Streeton_-_Fire%27s_on_-_Google_Art_Project.jpg?download'
fname = 'image.jpg'

# Should be one global variable
MANAGER = enlighten.get_manager()

r = requests.get(url, stream = True)
assert r.status_code == 200, r.status_code
dlen = int(r.headers.get('Content-Length', '0')) or None

with MANAGER.counter(color = 'green', total = dlen and math.ceil(dlen / 2 ** 20), unit = 'MiB', leave = False) as ctr, \
     open(fname, 'wb', buffering = 2 ** 24) as f:
    for chunk in r.iter_content(chunk_size = 2 ** 20):
        print(chunk[-16:].hex().upper())
        f.write(chunk)
        ctr.update()

(asciicast screen recording)

Answer 4 (score: 1)

Here is an answer using tqdm.

import requests
from tqdm import tqdm


def download(url, fname):
    resp = requests.get(url, stream=True)
    total = int(resp.headers.get('content-length', 0))
    with open(fname, 'wb') as file, tqdm(
            desc=fname,
            total=total,
            unit='iB',
            unit_scale=True,
            unit_divisor=1024,
    ) as bar:
        for data in resp.iter_content(chunk_size=1024):
            size = file.write(data)
            bar.update(size)

Gist: https://gist.github.com/yanqd0/c13ed29e29432e3cf3e7c38467f42f51

Answer 5 (score: 1)

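The tqdm package now includes a function designed specifically for this kind of situation: wrapattr. You just wrap an object's read (or write) attribute, and tqdm handles the rest. There is no fiddling with chunk sizes or anything like that. Here is a simple download function that puts it all together with requests: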

def download(url, filename):
    import functools
    import pathlib
    import shutil
    import requests
    from tqdm.auto import tqdm
    
    r = requests.get(url, stream=True, allow_redirects=True)
    if r.status_code != 200:
        r.raise_for_status()  # Will only raise for 4xx/5xx codes, so...
        raise RuntimeError(f"Request to {url} returned status code {r.status_code}")
    file_size = int(r.headers.get('Content-Length', 0))

    path = pathlib.Path(filename).expanduser().resolve()
    path.parent.mkdir(parents=True, exist_ok=True)

    desc = "(Unknown total file size)" if file_size == 0 else ""
    r.raw.read = functools.partial(r.raw.read, decode_content=True)  # Decompress if needed
    with tqdm.wrapattr(r.raw, "read", total=file_size, desc=desc) as r_raw:
        with path.open("wb") as f:
            shutil.copyfileobj(r_raw, f)

    return path

Answer 6 (score: 1)

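Comparing the size you have already downloaded with the total file size tells you how far along you are. Alternatively, you can use tqdm.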
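As a rough illustration of the first option, the downloaded-versus-total comparison can be turned into a text bar with plain integer arithmetic. render_bar is a hypothetical name chosen for this sketch, and it assumes the total comes from a Content-Length header that may be missing:

```python
def render_bar(downloaded, total, width=20):
    """Return a text bar for downloaded/total bytes, or a byte count if total is unknown."""
    if total <= 0:
        # Size unknown (no Content-Length): show only the byte count.
        return f"{downloaded} bytes"
    # Clamp both values so an oversized download never overflows the bar.
    percent = min(downloaded * 100 // total, 100)
    filled = min(downloaded * width // total, width)
    return f"[{'#' * filled}{'-' * (width - filled)}] {percent:3d}%"


print(render_bar(50, 100, 10))
```

Calling this after each chunk is written and printing the result with a leading "\r" would redraw the bar in place.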