I want to decompress data from a bz2 URL directly into a target file. Here is the code:
import urllib2
import bz2

filename = 'temp.file'
req = urllib2.urlopen('http://example.com/file.bz2')
CHUNK = 16 * 1024
with open(filename, 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk: break
        fp.write(bz2.decompress(chunk))
fp.close()
bz2.decompress(chunk) fails with: ValueError: couldn't find end of stream
Answer 0 (score: 3)
Use bz2.BZ2Decompressor to perform sequential decompression:
filename = 'temp.file'
req = urllib2.urlopen('http://example.com/file.bz2')
CHUNK = 16 * 1024
decompressor = bz2.BZ2Decompressor()
with open(filename, 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk:
            break
        fp.write(decompressor.decompress(chunk))
req.close()
By the way, as long as you use the with statement, there is no need to call fp.close().
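The pattern above can be checked without a network connection. The following is a minimal self-contained sketch that round-trips in-memory data through bz2.BZ2Decompressor in fixed-size chunks (the sample payload and chunk size here are arbitrary stand-ins, not from the original answer):

```python
import bz2
import io

# Build a compressed payload in memory so no download is needed.
original = b"hello bz2 " * 1000
source = io.BytesIO(bz2.compress(original))

CHUNK = 16 * 1024
decompressor = bz2.BZ2Decompressor()
out = bytearray()
while True:
    chunk = source.read(CHUNK)
    if not chunk:
        break
    # Feed every chunk to the same decompressor object; it keeps
    # the bz2 stream state between calls.
    out.extend(decompressor.decompress(chunk))

assert bytes(out) == original
```

The key point is that a single BZ2Decompressor instance persists across reads, which is what plain bz2.decompress() on an isolated chunk cannot do.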
Answer 1 (score: 2)
You should use BZ2Decompressor, which supports incremental decompression. See https://docs.python.org/2/library/bz2.html#bz2.BZ2Decompressor
I haven't debugged this, but it should work like this:
filename = 'temp.file'
req = urllib2.urlopen('http://example.com/file.bz2')
CHUNK = 16 * 1024
decompressor = bz2.BZ2Decompressor()
with open(filename, 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk: break
        decomp = decompressor.decompress(chunk)
        if decomp:
            fp.write(decomp)
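The `if decomp:` guard matters because decompress() legitimately returns an empty byte string while it is still buffering a partial bz2 block. A quick way to see this (feeding an in-memory compressed payload one byte at a time; the payload itself is an arbitrary example):

```python
import bz2

payload = b"x" * 100000
data = bz2.compress(payload)

decompressor = bz2.BZ2Decompressor()
# Feed the compressed stream a single byte at a time.
pieces = [decompressor.decompress(data[i:i + 1]) for i in range(len(data))]

# Most single-byte feeds yield no output until a whole block is available,
# but the concatenated output is still the full original payload.
assert any(p == b"" for p in pieces)
assert b"".join(pieces) == payload
```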
Answer 2 (score: 0)
Here is a more direct and efficient way, using requests in streaming mode:
import requests
import shutil

req = requests.get('http://example.com/file.bz2', stream=True)
with open(filename, 'wb') as fp:
    shutil.copyfileobj(req.raw, fp)
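Note that shutil.copyfileobj copies the raw bytes as served, so the resulting file is still bz2-compressed; to write a decompressed file while streaming, you still need a BZ2Decompressor in the loop. A sketch of that combination, using an in-memory BytesIO as a stand-in for req.raw (the stand-in and payload are illustrative, not a tested download):

```python
import bz2
import io

# Stand-in for req.raw: any file-like object yielding compressed bytes.
original = b"streamed payload " * 500
raw = io.BytesIO(bz2.compress(original))

decompressor = bz2.BZ2Decompressor()
dest = io.BytesIO()  # in place of open(filename, 'wb')
for chunk in iter(lambda: raw.read(16 * 1024), b""):
    dest.write(decompressor.decompress(chunk))

assert dest.getvalue() == original
```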