两部分问题。我试图从互联网档案下载多个存档的Cory Doctorow播客。旧的那个没有进入我的iTunes提要。我编写了脚本,但下载的文件格式不正确。
Q1 - 我如何更改下载zip mp3文件? Q2 - 将变量传递给URL的更好方法是什么?
# and the base url.
def dlfile(file_name,file_mode,base_url):
from urllib2 import Request, urlopen, URLError, HTTPError
#create the url and the request
url = base_url + file_name + mid_url + file_name + end_url
req = Request(url)
# Open the url
try:
f = urlopen(req)
print "downloading " + url
# Open our local file for writing
local_file = open(file_name, "wb" + file_mode)
#Write to our local file
local_file.write(f.read())
local_file.close()
#handle errors
except HTTPError, e:
print "HTTP Error:",e.code , url
except URLError, e:
print "URL Error:",e.reason , url
# Set the range
var_range = range(150,153)
# Iterate over image ranges
for index in var_range:
base_url = 'http://www.archive.org/download/Cory_Doctorow_Podcast_'
mid_url = '/Cory_Doctorow_Podcast_'
end_url = '_64kb_mp3.zip'
#create file name based on known pattern
file_name = str(index)
dlfile(file_name,"wb",base_url
此脚本改编自here
答案 0 :(得分:52)
以下是我如何处理网址构建和下载。我确保将该文件命名为url的基本名称(尾随斜杠之后的最后一位),并且我还使用with
子句打开要写入的文件。这使用ContextManager这很好,因为当块退出时它将关闭该文件。另外,我使用模板为url构建字符串。 urlopen
不需要请求对象,只需要一个字符串。
import os
from urllib2 import urlopen, URLError, HTTPError
def dlfile(url):
# Open the url
try:
f = urlopen(url)
print "downloading " + url
# Open our local file for writing
with open(os.path.basename(url), "wb") as local_file:
local_file.write(f.read())
#handle errors
except HTTPError, e:
print "HTTP Error:", e.code, url
except URLError, e:
print "URL Error:", e.reason, url
def main():
# Iterate over image ranges
for index in range(150, 151):
url = ("http://www.archive.org/download/"
"Cory_Doctorow_Podcast_%d/"
"Cory_Doctorow_Podcast_%d_64kb_mp3.zip" %
(index, index))
dlfile(url)
if __name__ == '__main__':
main()
答案 1 :(得分:2)