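On Linux (Ubuntu), when I run wget www.example.com/file.zip -O file.zip, I see a bar that shows the download progress.
Is there a way in Python to retrieve all the information that I outlined in red in the screenshot?
That is, I would like to retrieve this information into separate Python variables.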
Answer 0 (score: 2)
You can implement your own wget in Python using the urllib library and a custom reporthook function:
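    import time
    import urllib.request

    start_time = None

    def reporthook(count_blocks, block_size, total_size):
        # urlretrieve calls this once when the connection opens, then after each block.
        global start_time
        if count_blocks == 0:
            start_time = time.time()
            return
        duration = time.time() - start_time
        progress_size = count_blocks * block_size
        # total_size is -1 if the server does not report a size.
        print("downloaded %.1f%%" % (100.0 * progress_size / total_size))
        # etc ...

    url, filename = "http://www.example.com/file.zip", "file.zip"
    urllib.request.urlretrieve(url, filename, reporthook)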
(See also https://stackoverflow.com/a/4152008/2314737)
A complete Python 3 implementation is available here: https://pypi.python.org/pypi/wget
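To get the values into separate variables instead of printing them, the arithmetic inside the hook can be moved into a small pure function. The following is only a sketch: `progress_stats` and the keys it returns are hypothetical names, not part of urllib or the wget package, and it assumes the server reported a positive total size.

```python
def progress_stats(count_blocks, block_size, total_size, elapsed):
    """Derive progress values from the arguments passed to a reporthook.

    `elapsed` is the number of seconds since the download started.
    Assumes total_size > 0 (the server sent a Content-Length).
    """
    # The last block is usually partial, so cap the byte count at total_size.
    downloaded = min(count_blocks * block_size, total_size)
    percent = 100.0 * downloaded / total_size
    # Average speed in bytes per second since the start of the download.
    speed = downloaded / elapsed if elapsed > 0 else 0.0
    # Estimated seconds left; None while no speed can be computed yet.
    remaining = (total_size - downloaded) / speed if speed > 0 else None
    return {
        "downloaded": downloaded,
        "percent": percent,
        "speed": speed,
        "remaining": remaining,
    }
```

Inside a reporthook, this could be called as `progress_stats(count_blocks, block_size, total_size, time.time() - start_time)` and the result stored wherever the rest of the program needs it.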
Answer 1 (score: 0)
You can use subprocess:
    import os
    import subprocess

    process = subprocess.Popen(
        ['wget', 'http://speedtest.dal01.softlayer.com/downloads/test10.zip', '-O', '/dev/null'],
        stderr=subprocess.PIPE)
    started = False
    for line in process.stderr:
        line = line.decode("utf-8", "replace")
        if started:
            print(line.split())
        elif line == os.linesep:
            # wget prints an empty line before the progress lines begin.
            started = True
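Now you only need to parse the output of line.split() and change the wget arguments (the ones above are for testing only and do not save the downloaded data).
This works on Windows with Python 3.4: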
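As one way to do that parsing, the sketch below turns a progress line into named fields. It assumes the dot-style lines that wget writes when stderr is a pipe, for example `1050K .......... .......... 10% 1.21M 7s`; the exact layout can differ between wget versions and locales, and `Progress` and `parse_progress_line` are hypothetical names introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical container for the values of interest on one progress line.
Progress = namedtuple("Progress", "offset percentage speed remaining")


def parse_progress_line(line):
    """Return a Progress tuple, or None if the line is not a progress line.

    Assumes the last three whitespace-separated fields are the percentage,
    the speed, and the remaining time, and that the first field is the offset.
    """
    fields = line.split()
    if len(fields) < 4 or not fields[-3].endswith("%"):
        return None
    return Progress(fields[0], fields[-3], fields[-2], fields[-1])
```

Lines that do not match, such as the connection messages wget prints first, yield `None` and can simply be skipped.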
    import subprocess
    import os

    wget = os.path.join("C:\\", "Program Files (x86)", "GnuWin32", "bin", "wget.exe")
    process = subprocess.Popen(
        [wget, 'http://speedtest.dal01.softlayer.com/downloads/test10.zip', '-O', 'NUL'],
        stderr=subprocess.PIPE)
    started = False
    for line in process.stderr:
        line = line.decode("utf-8", "replace")
        if started:
            fields = line.split()
            # A full progress line has 9 fields: offset, 5 dot groups, percentage, speed, time left.
            if len(fields) == 9:
                percentage = fields[6]
                speed = fields[7]
                remaining = fields[8]
                print("Downloaded {} with {} per second and {} left.".format(percentage, speed, remaining), end='\r')
        elif line == os.linesep:
            # wget prints an empty line before the progress lines begin.
            started = True
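The percentage and speed extracted this way are still strings such as `10%` and `1.21M`. If numeric values are needed, a conversion along the following lines may help. This is a sketch under assumptions: it treats the K, M, and G suffixes as powers of 1024 and accepts a comma as the decimal separator for some locales; `percent_to_float` and `speed_to_bytes` are hypothetical helper names.

```python
def percent_to_float(text):
    """Convert a string like '10%' to the number 10.0."""
    return float(text.rstrip("%"))


def speed_to_bytes(text):
    """Convert a speed string like '1.21M' to bytes per second.

    Assumes K, M, and G are binary multiples (1024-based).
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().replace(",", ".")  # tolerate a comma decimal separator
    suffix = text[-1].upper()
    if suffix in units:
        return float(text[:-1]) * units[suffix]
    return float(text)  # no suffix: already plain bytes per second
```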
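Answer 2 (score: 0)
Because wget writes this information to stderr, you need to read it from the stderr of the child process.
Since the output keeps changing while the download runs, we can use select to read stderr as data arrives.
For reference, here is an example: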
    # -*- coding: utf-8 -*-
    from subprocess import PIPE, Popen
    import fcntl
    import os
    import select

    proc = Popen(['wget', 'http://speedtest.london.linode.com/100MB-london.bin'], stderr=PIPE)

    # Put the child's stderr into non-blocking mode once, before the read loop.
    fd = proc.stderr.fileno()
    fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

    buf = ''
    while proc.poll() is None:
        # Wait up to 0.1 s for wget to write more progress output.
        if not select.select([fd], [], [], 0.1)[0]:
            continue
        # read() can return None on a non-blocking pipe when nothing is ready.
        chunk = proc.stderr.read() or b''
        buf += chunk.decode('utf-8', 'replace')
        if '\n' in buf and '%' in buf and '.' in buf:
            print(buf.strip().split())
            buf = ''
    proc.wait()
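A non-blocking read can return a chunk that ends in the middle of a line, so the buffer printed above may contain a partial progress line. One possible refinement, sketched here with a hypothetical helper named `split_complete_lines`, is to hand on only the complete lines and keep the unfinished tail in the buffer for the next read.

```python
def split_complete_lines(buf):
    """Split a text buffer into its complete lines and the unfinished tail.

    Returns (lines, remainder); remainder is the text after the last newline
    and should be kept as the new buffer.
    """
    *lines, remainder = buf.split("\n")
    return lines, remainder
```

In the read loop, `lines, buf = split_complete_lines(buf)` would replace the `'\n' in buf` check, and each entry in `lines` could then be split into fields.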