Download a file with Python urllib and rename it according to its file type

Time: 2013-11-26 07:33:22

Tags: python python-2.7 beautifulsoup

I have this code:

import urllib
from bs4 import BeautifulSoup

url = "http://www.downloadcrew.com/article/28976-flicflac"
pageurl = urllib.urlopen(url)
soup = BeautifulSoup(pageurl)
# The application name is the text of the article title
app_name = soup.find('div',{'id':'articleTop'}).find('h1',{'id':'articleTitle'}).contents[0].strip()
# The href holds comma-separated arguments; the second one is the download path
download_link = "http://www.downloadcrew.com"+soup.find('div',{'class':'downloadLink'}).find('a')['href'].split(',')[1].strip().strip("'")
source = urllib.urlopen(download_link).read()
print "Downloading: "+(app_name)
filename = (app_name)
# Write in binary mode so the downloaded bytes are not altered
with open(filename,'wb') as files:
    files.write(source)

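When I run this code, the downloaded file should be named 'flicflac.zip', but that is not what I get: the file is saved without its file extension. How can I make it name the file automatically, as above?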

1 answer:

Answer 0: (score: 3)

You can check the Content-Type of the response and add the matching extension:

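from mimetypes import guess_extension

source = urllib.urlopen(download_link)
# Drop any parameters (e.g. "; charset=...") so only the bare MIME type is looked up
content_type = source.info()['Content-Type'].split(';')[0].strip()
extension = guess_extension(content_type)
if extension:
    app_name += extension
else:
    # Unknown type: what to do? Discard?
    pass

# Later, call source.read() to get the file contents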
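
The naming step can also be pulled out into a small function, which makes it easy to try without touching the network. This is only a minimal sketch: the helper name filename_with_extension is hypothetical (it is not part of urllib or mimetypes), and the sample header value "application/zip; charset=binary" is an assumed example, not something the site is known to send.

```python
from mimetypes import guess_extension


def filename_with_extension(app_name, content_type):
    """Return app_name plus the extension guessed from a Content-Type value.

    Hypothetical helper for illustration; falls back to the bare name when
    the header is missing or the type is unknown.
    """
    if not content_type:
        return app_name
    # Drop parameters such as "; charset=binary": guess_extension expects
    # only the bare "type/subtype" string.
    mime_type = content_type.split(";")[0].strip().lower()
    extension = guess_extension(mime_type)
    # guess_extension returns None for types it does not know.
    if extension and not app_name.lower().endswith(extension):
        return app_name + extension
    return app_name


# Assumed sample values, standing in for app_name and the response header.
print(filename_with_extension("flicflac", "application/zip; charset=binary"))
```

With the code above, it could probably be called as filename_with_extension(app_name, source.info().get('Content-Type')), so that a missing header returns the plain name rather than raising an error. The result depends on the MIME tables available on the machine, so an unusual type may still come back without an extension.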