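I'm trying to get urllib to download every file ending in .gz from a directory. My code runs without errors, but nothing is downloaded. I'm not sure what I'm doing wrong. Please help.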
from urllib import *
directory = 'https://eogdata.mines.edu/wwwdata/viirs_products/vnf/v30'
with request.urlopen(directory) as doc:
    for line in doc:
        if line.endswith(b'gz'):
            urllib.request.retrieve(line)
Answer 0 (score: 0)
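There are a few errors in your script. The URL returns an HTML directory listing, not the files themselves, so you first need to parse the file names out of that page and then check whether each one is a gz file. Iterating over the raw response lines never matches, because each line is a piece of HTML ending in a newline rather than a bare file name.
I tried this using urllib2: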
import urllib2
import re

directory = 'https://eogdata.mines.edu/wwwdata/viirs_products/vnf/v30/'
sock = urllib2.urlopen(directory)
html = sock.read()  # read the listing before closing the connection
sock.close()
found_files = re.findall(r'href="(.*?)"', html)  # parse all the links available on the page
for file in found_files:
    if file.endswith('gz'):
        file_location = directory + file  # the gz file location
        print "downloading %s from %s" % (file, file_location)
        file_download = urllib2.urlopen(file_location)  # get file from url
        with open(file, "wb") as local_file:  # open a file with the same name as the gz file
            local_file.write(file_download.read())  # write data to our file
        file_download.close()
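Since the question uses Python 3's urllib.request rather than urllib2, here is a rough Python 3 sketch of the same approach. It assumes the directory listing is plain HTML with double-quoted href attributes and needs no login; the helper names find_gz_links and download_all are made up for illustration and are not part of any library.

```python
import re
import shutil
from urllib.parse import urljoin
from urllib.request import urlopen

# Directory URL from the question; the trailing slash matters for urljoin.
DIRECTORY = 'https://eogdata.mines.edu/wwwdata/viirs_products/vnf/v30/'


def find_gz_links(html, base_url):
    """Return absolute URLs for every href in `html` that ends with .gz."""
    hrefs = re.findall(r'href="(.*?)"', html)
    return [urljoin(base_url, href) for href in hrefs if href.endswith('.gz')]


def download_all(base_url=DIRECTORY):
    """Fetch the listing page and save each .gz file to the current directory."""
    with urlopen(base_url) as response:
        # Assumption: the listing page is UTF-8 (or close enough) text.
        html = response.read().decode('utf-8', errors='replace')
    for url in find_gz_links(html, base_url):
        filename = url.rsplit('/', 1)[-1]  # last path component as local name
        print('downloading %s from %s' % (filename, url))
        with urlopen(url) as remote, open(filename, 'wb') as local_file:
            shutil.copyfileobj(remote, local_file)  # stream to disk in chunks
```

Calling download_all() would then fetch the listing and save each .gz file next to the script. Using urljoin instead of string concatenation also handles links that are absolute paths or full URLs, and copyfileobj avoids holding a whole archive in memory.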