I need to download a file from a website and unzip it. Here is the code I'm using:
#!/usr/bin/python
#geoipFolder = r'/my/folder/path/ ' #Mac/Linux folder path
geoipFolder = r'D:\my\folder\path\ ' #Windows folder path
geoipFolder = geoipFolder[:-1] #workaround for Windows escaping trailing quote
geoipName = 'GeoIPCountryWhois'
geoipURL = 'http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip'
import urllib2
response = urllib2.urlopen(geoipURL)
f = open('%s.zip' % (geoipFolder+geoipName),"w")
f.write(repr(response.read()))
f.close()
import zipfile
zip = zipfile.ZipFile(r'%s.zip' % (geoipFolder+geoipName))
zip.extractall(r'%s' % geoipFolder)
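This code works on Mac and Linux machines, but not on Windows. There, the .zip file is written, but the script throws this error:
zipfile.BadZipfile: File is not a zip file
I also can't unzip the file with Windows Explorer. It says:
The compressed (zipped) folder is empty.
But the file on disk is 6 MB.
Any ideas about what I'm doing wrong on Windows?
Thanks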
Answer 0 (score: 3)
Your zip file is corrupted on Windows because you open the file in write/text mode (line-terminator conversion corrupts the binary data):
f = open('%s.zip' % (geoipFolder+geoipName),"w")
You have to open it in write/binary mode, like this:
f = open('%s.zip' % (geoipFolder+geoipName),"wb")
(This still works on Linux, of course.)
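To make the corruption concrete, here is a minimal sketch that simulates the newline translation in memory. It is written for Python 3 (the question uses Python 2), and the archive contents (sample.csv and its two lines) are made up for illustration. It only mimics what text mode does on Windows by replacing each \n byte with \r\n; it does not open a file in text mode.

```python
import io
import zipfile

# Build a small zip archive in memory. ZIP_STORED keeps the payload
# uncompressed, so its newline bytes appear verbatim in the archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as archive:
    archive.writestr("sample.csv", "a,b\n1,2\n")  # hypothetical sample data
original = buf.getvalue()

# Simulate Windows text-mode output: every \n byte is written as \r\n.
translated = original.replace(b"\n", b"\r\n")

print(len(original), len(translated))  # the translated copy is longer
print(original == translated)          # False
```

The extra bytes shift everything that follows them, so the offsets and sizes recorded inside the archive no longer match the data on disk.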
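To sum up, using a with block (and dropping repr, which writes an escaped string representation instead of the raw bytes) is a more Pythonic way: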
with open('{}{}.zip'.format(geoipFolder,geoipName),"wb") as f:
    f.write(response.read())
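For readers on Python 3, where urllib2 no longer exists, the same idea might look like the sketch below. The helper name download_binary is hypothetical, and streaming with shutil.copyfileobj is a swapped-in choice rather than what the answer above uses; the essential part is still the "wb" mode.

```python
import shutil
import urllib.request


def download_binary(url, dest_path):
    """Stream the response body to dest_path without altering any bytes.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # "wb" disables newline translation, so the bytes reach the disk unchanged.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as f:
        shutil.copyfileobj(response, f)
```

With the variables from the question, a call would be along the lines of download_binary(geoipURL, geoipFolder + geoipName + '.zip').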
EDIT: there is no need to write the file to disk at all. You can use io.BytesIO, because a ZipFile object accepts a file handle as its first argument.
import io
import zipfile
outbuf = io.BytesIO(response.read())  # wrap the downloaded bytes in an in-memory file object
zip = zipfile.ZipFile(outbuf) # pass the fake-file handle: no disk write, no temp file
zip.extractall(r'%s' % geoipFolder)
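The in-memory approach can be tried without any network access. The sketch below, written for Python 3, builds a tiny archive in memory as a stand-in for response.read() and extracts it into a temporary directory. The helper name extract_zip_bytes and the sample.csv member are assumptions made for illustration.

```python
import io
import os
import tempfile
import zipfile


def extract_zip_bytes(data, dest_dir):
    """Extract a zip archive held in memory (bytes) into dest_dir.

    Hypothetical helper; returns the names of the archive members.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Stand-in for response.read(): a tiny archive built in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("sample.csv", "placeholder\n")  # made-up content

dest = tempfile.mkdtemp()
names = extract_zip_bytes(buf.getvalue(), dest)
print(names)
print(os.path.exists(os.path.join(dest, names[0])))
```

Because the data never passes through a text-mode file, newline translation cannot occur on any platform.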