I have this code:
import urllib
from bs4 import BeautifulSoup
url = "http://www.downloadcrew.com/article/28976-flicflac"
pageurl = urllib.urlopen(url)
soup = BeautifulSoup(pageurl)
app_name = soup.find('div',{'id':'articleTop'}).find('h1',{'id':'articleTitle'}).contents[0].strip()
download_link = "http://www.downloadcrew.com"+soup.find('div',{'class':'downloadLink'}).find('a')['href'].split(',')[1].strip().strip("'")
source = urllib.urlopen(download_link).read()
print "Downloading: "+(app_name)
filename = (app_name)
files = open(filename,'wb')  # binary mode: the download is not text
files.write(source)
files.close()
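When I run this code, the downloaded file should be named 'flicflac.zip', but the file I get is named 'flicflac', with no file extension. How can I make it name the file automatically, as above?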
Answer 0 (score: 3)
You can check the content type of the file and add the extension accordingly:
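from mimetypes import guess_extension
source = urllib.urlopen(download_link)
# Drop any parameters (e.g. "; charset=...") before looking up the extension
content_type = source.info()['Content-Type'].split(';')[0].strip()
extension = guess_extension(content_type)
if extension:
    app_name += extension
else:
    # what to do? discard?
    pass
# later do source.read()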
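As a rough sketch of how the naming step could be pulled into one place, the helper below picks a file name from the response headers. It is written for Python 3 rather than the Python 2 used above, and `pick_filename` is a hypothetical name, not part of any library. It also assumes the server may send a `Content-Disposition` header with a suggested file name, which the original post does not confirm. If it does not, the function falls back to the content-type lookup from the answer.

```python
import mimetypes
import os
from email.message import Message


def pick_filename(app_name, content_type=None, content_disposition=None):
    """Return a file name for the download, adding an extension when one can be found.

    pick_filename is a hypothetical helper for illustration only.
    """
    # 1. Prefer a file name suggested by the server (assumption: it may send one).
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        suggested = msg.get_filename()
        if suggested:
            # basename() drops any directory part found in the header value.
            return os.path.basename(suggested)

    # 2. Otherwise map the MIME type to an extension, as in the answer above.
    if content_type:
        mime = content_type.split(";")[0].strip()  # drop "; charset=..." etc.
        extension = mimetypes.guess_extension(mime)
        if extension:
            return app_name + extension

    # 3. Nothing usable in the headers: keep the bare name.
    return app_name
```

In Python 3 the two header values would typically come from `response.headers.get("Content-Type")` and `response.headers.get("Content-Disposition")` on the object returned by `urllib.request.urlopen(download_link)`. Remember to open the output file in `'wb'` mode when writing the result of `response.read()`.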