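GZip error when reading from a file in a Python script

Time: 2014-08-12 21:38:38

Tags: python gzip compression

I have a .gz file that I am trying to open and parse so I can load it into a database. Running the following code...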

import gzip

def process_file(filename):
    with gzip.GzipFile(filename, 'rU', 9) as uncompressed_file:
        uncompressed_file.next()  # Skip the headers

        for line in uncompressed_file:
            line = line.replace('\n', '').split('\t')
            # Do some more stuff with the line

...produces this error:

    File "path/to/script", line 169, in process_file
        uncompressed_file.next()  # Skip the headers
    File "/usr/lib/python2.7/gzip.py", line 450, in readline
        c = self.read(readsize)
    File "/usr/lib/python2.7/gzip.py", line 256, in read
        self._read(readsize)
    File "/usr/lib/python2.7/gzip.py", line 307, in _read
        uncompress = self.decompress.decompress(buf)
error: Error -3 while decompressing: invalid distance too far back

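What is especially odd is that the code runs fine on my local machine (Mac OS X 10.9.4) but fails on my server (Ubuntu 12.04.4 LTS).

Any insight is appreciated, as I am currently out of ideas.

1 answer:

Answer 0: (score: 1)

Solved it. It seems that using the 'with' syntax on gzip.open or gzip.GzipFile is not fully supported on all platforms.

It is still unclear why it does not work consistently, but switching to this code solved the problem: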

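import gzip

def process_file(filename):
    f = gzip.open(filename, 'rb')  # Plain binary mode, no 'with' block
    f.next()  # Skip the headers

    for line in f:
        # Drop the newline and split the row into tab-separated fields
        line = line.replace('\n', '').split('\t')

    f.close()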
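
For comparison, here is a minimal sketch of the same workaround with the cleanup made explicit, so the file is closed even if decompression raises. It is not the original poster's code: the `rows` list, the return value, and the `try`/`finally` are assumptions added for illustration. It keeps the 'rb' mode from the snippet above. The failing version used 'rU', so the mode change may matter as much as dropping 'with', since newline translation only makes sense for text and not for compressed bytes.

```python
import gzip


def process_file(filename):
    """Return the tab-separated fields of each data row, skipping the header."""
    rows = []  # hypothetical: collect rows instead of writing to a database
    f = gzip.open(filename, 'rb')  # binary mode, no newline translation
    try:
        next(f, None)  # skip the header line; tolerates an empty file
        for line in f:
            # Strip the trailing line ending, then split on tabs
            rows.append(line.rstrip(b'\r\n').split(b'\t'))
    finally:
        f.close()  # always release the file handle
    return rows
```

Calling `process_file('data.gz')` (a hypothetical file name) returns a list of rows, each a list of byte strings, which can then be handed to whatever database code follows.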