I have a .gz file that I am trying to open and parse so I can load it into a database. Running the following code...
    def process_file(filename):
        with gzip.GzipFile(filename, 'rU', 9) as uncompressed_file:
            uncompressed_file.next() # Skip the headers
            for line in uncompressed_file:
                line = line.replace('\n', '').split('\t')
                # Do some more stuff with the line
produces this error...
      File "path/to/script", line 169, in process_file
        uncompressed_file.next() # Skip the headers
      File "/usr/lib/python2.7/gzip.py", line 450, in readline
        c = self.read(readsize)
      File "/usr/lib/python2.7/gzip.py", line 256, in read
        self._read(readsize)
      File "/usr/lib/python2.7/gzip.py", line 307, in _read
        uncompress = self.decompress.decompress(buf)
    error: Error -3 while decompressing: invalid distance too far back
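What is especially odd is that the code runs fine on my local machine (Mac OSX 10.9.4) but fails on my server (Ubuntu 12.04.4 LTS).
Any insight is appreciated, as I am out of ideas at the moment.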
Answer 0 (score: 1)
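Solved it. It seems that using the 'with' syntax on gzip.open or gzip.GzipFile is not fully supported on all platforms.
It is still unclear why it does not work consistently, but switching to this code fixed the problem: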
    import gzip

    def process_file(filename):
        f = gzip.open(filename, 'rb')
        f.next() # Skip the headers
        for line in f:
            line = line.replace('\n', '').split('\t')
        f.close()
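For anyone on Python 3, the same approach can be sketched with an explicit try/finally so the file is closed even if parsing raises. This is only a sketch under assumptions that are not in the original post: it targets Python 3, opens the file in text mode ('rt'), assumes the content is UTF-8, and returns the parsed rows in a list instead of doing further work on each line. Note also that the working code above drops the 'U' flag from the mode as well as the 'with' statement, so it is not established here which of the two changes made the difference.

```python
import gzip


def process_file(filename):
    """Return the tab-separated rows of a gzip file, skipping the header line."""
    rows = []
    # 'rt' decompresses and decodes to text; UTF-8 is an assumption about the data.
    f = gzip.open(filename, 'rt', encoding='utf-8')
    try:
        next(f, None)  # Skip the headers; no error if the file is empty
        for line in f:
            rows.append(line.rstrip('\n').split('\t'))
    finally:
        f.close()  # Runs even if reading or parsing raises
    return rows
```

Calling process_file('data.tsv.gz') (a hypothetical file name) would return one list of fields per data row.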