Downloading a .gz file correctly with http.client

Time: 2017-06-28 15:52:47

Tags: python

I am currently downloading .tar.gz files from a server:

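import http.client
import ssl

# Note: cert_file/key_file were removed in Python 3.12; there, call
# context.load_cert_chain(pem, key) on the SSLContext instead.
conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar.gz"

# Copy the response body to disk in 4 KiB pieces
with open(fp, 'wb') as f:
    while True:
        piece = rsp.read(4096)
        if not piece:
            break
        f.write(piece)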

However, I am worried that this approach causes compression problems, because the file sometimes stays gzipped and sometimes does not.
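One way to tell whether a given download stayed gzipped is to look at its first two bytes, since every gzip stream begins with the magic number 0x1f 0x8b. A minimal sketch, where `looks_gzipped` is a hypothetical helper name and not part of `http.client` or `gzip`:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream (RFC 1952)

def looks_gzipped(first_bytes):
    """Return True if the given leading bytes start with the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC
```

It could be applied to the first `piece` read from the response, or to the first bytes of a file already saved to disk.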

Question:

What is the proper way to save a file from a socket stream using the gzip module?

Supporting information:

I did the following:

import gzip
import http.client
import io
import ssl

conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar"

# Buffer the whole body in memory, then decompress it
f_like_obj = io.BytesIO()
f_like_obj.write(rsp.read())
f_like_obj.seek(0)
f_decomp = gzip.GzipFile(fileobj=f_like_obj, mode='rb')

with open(fp, 'wb') as f:
    f.write(f_decomp.read())

However, the same file downloaded at two separate times sometimes fails with this error:

"Not a gzipped file (b'<!')"
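The `b'<!'` in that message is the first two bytes of the body, which suggests the server sometimes answers with an HTML page (for example an error page) instead of the archive. A rough sketch of a guard, assuming the response object exposes `.status` and `.read(n)` as `http.client.HTTPResponse` does. `save_gunzipped` is a hypothetical name, and the sketch swaps in `zlib.decompressobj` for `gzip.GzipFile` so that the bytes already read for the check can be fed straight to the decompressor:

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream

def save_gunzipped(rsp, out_file, chunk_size=64 * 1024):
    """Stream-decompress a gzip response body into out_file.

    rsp needs a .status attribute and a .read(n) method;
    out_file is a file object opened in binary write mode.
    Assumes the body is a single gzip member.
    """
    if rsp.status != 200:
        raise RuntimeError(f"unexpected HTTP status: {rsp.status}")

    first = rsp.read(chunk_size)
    if not first.startswith(GZIP_MAGIC):
        # e.g. an HTML page beginning with b'<!'
        raise ValueError(f"body is not gzip data, it starts with {first[:16]!r}")

    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    chunk = first
    while chunk:
        out_file.write(decomp.decompress(chunk))
        chunk = rsp.read(chunk_size)
    out_file.write(decomp.flush())
```

Failing early like this keeps a partial or HTML file from being saved under the archive name, and the error text shows what the server actually sent.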

1 answer:

Answer 0 (score: 1)

Try this:

import gzip
import http.client
import ssl

conn = http.client.HTTPSConnection(host = host,
                                   port = port,
                                   cert_file = pem,
                                   key_file = key,
                                   context = ssl.SSLContext(ssl.PROTOCOL_TLS))

conn.request('GET', url)

rsp = conn.getresponse()

fp = r"H:\path\to\new.tar"

with gzip.GzipFile(fileobj=rsp) as decomp, open(fp, 'wb') as f:
    f.write(decomp.read())
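
Note that `decomp.read()` holds the whole decompressed archive in memory. If that is a concern, the copy can probably be done in chunks with `shutil.copyfileobj`. A sketch, with `gunzip_stream` as a hypothetical helper name:

```python
import gzip
import shutil

def gunzip_stream(src, dst, chunk_size=64 * 1024):
    """Decompress gzip data from the readable binary stream src into dst.

    src can be any object with a read(n) method returning bytes;
    dst is a file object opened in binary write mode.
    """
    with gzip.GzipFile(fileobj=src, mode="rb") as decomp:
        # Copy chunk_size bytes of decompressed data at a time
        shutil.copyfileobj(decomp, dst, chunk_size)
```

With the response above, this would be called as `gunzip_stream(rsp, f)` inside `with open(fp, 'wb') as f:`.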