zipfile fails to extract a file located on an NFS drive

Time: 2019-11-23 00:11:43

Tags: python-3.x celery zipfile nfs

Existing workflow

A Celery task periodically pulls a zip file from an S3 bucket into /tmp/, and zipfile then extracts it into a new folder for processing.
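
For reference, the extraction step looks roughly like the sketch below. This is a simplified illustration, not the actual task code: the S3 download is left out, and the function name `extract_to_new_folder`, the `base_dir` parameter, and the UUID-based folder naming are assumptions made for the example. Only `zip_ref.extractall(temp_path)` matches the call that appears in the traceback.

```python
import os
import uuid
import zipfile


def extract_to_new_folder(zip_path, base_dir="/tmp"):
    """Extract zip_path into a freshly created folder under base_dir.

    Returns the path of the new folder.
    """
    # Hypothetical naming scheme: one UUID-named folder per archive.
    temp_path = os.path.join(base_dir, str(uuid.uuid4()))
    os.makedirs(temp_path)

    # Same call as in the task: extract every member of the archive.
    with zipfile.ZipFile(zip_path) as zip_ref:
        zip_ref.extractall(temp_path)
    return temp_path
```

In the new workflow the only intended difference would be passing `base_dir="/efs"` instead of `/tmp`.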

Challenge: the zip file varies in size, which forced me to add a lot of block storage, and there have been out-of-disk-space errors.

New workflow

I mounted an AWS EFS drive at the /efs/ path.

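The Celery task periodically pulls the zip file from the S3 bucket into /efs/, and zipfile then extracts it into a new folder for processing. However, zipfile fails to extract the file on the /efs/ drive, with the following traceback: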

[2019-11-22 23:32:51,697: WARNING/ForkPoolWorker-3] Downloaded file
[2019-11-22 23:32:51,697: WARNING/ForkPoolWorker-3] Stored the file at
[2019-11-22 23:32:51,697: WARNING/ForkPoolWorker-3] /efs/fcb3b551-dcc5-48fb-a8ca-7cf6fd2f9cfd.zip
[2019-11-22 23:32:54,410: ERROR/ForkPoolWorker-3] Task analyzer.tasks.convert_csv[834df7ff-4dcf-4f04-b3b5-9956da867cf4] raised unexpected: error('Error -3 while decompressing data: invalid code lengths set',)
Traceback (most recent call last):
  File "/home/ubuntu/kirke/kirke_env/lib/python3.6/site-packages/celery/app/trace.py", line 375, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/ubuntu/kirke/kirke_env/lib/python3.6/site-packages/celery/app/trace.py", line 632, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/ubuntu/kirke/backend/analyzer/tasks.py", line 179, in convert_csv
    zip_ref.extractall(temp_path)
  File "/usr/lib/python3.6/zipfile.py", line 1524, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "/usr/lib/python3.6/zipfile.py", line 1579, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.6/shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.6/zipfile.py", line 872, in read
    data = self._read1(n)
  File "/usr/lib/python3.6/zipfile.py", line 948, in _read1
    data = self._decompressor.decompress(data, n)
zlib.error: Error -3 while decompressing data: invalid code lengths set

Oddly, when I run a plain Python shell and try to extract the zip file present in /efs/, it works fine!
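
One way to narrow this down might be to validate the archive inside the worker right before extraction, to see whether the bytes the worker reads differ from what the shell reads later. The sketch below is only a diagnostic idea, not a fix: the helper name `check_archive` and its `expected_size` parameter (for example, the object size reported by S3) are hypothetical, and it assumes that a partially written or corrupted file is one possible cause, which is not confirmed.

```python
import os
import zipfile
import zlib


def check_archive(zip_path, expected_size=None):
    """Return a list of problems found in the archive (empty if none)."""
    problems = []

    # Compare the on-disk size with the size the caller expects, if given.
    actual_size = os.path.getsize(zip_path)
    if expected_size is not None and actual_size != expected_size:
        problems.append(
            "size mismatch: expected %d bytes, found %d"
            % (expected_size, actual_size)
        )

    try:
        with zipfile.ZipFile(zip_path) as zip_ref:
            # testzip() reads every member and checks its CRC; it returns
            # the name of the first bad member, or None if all are fine.
            bad_member = zip_ref.testzip()
            if bad_member is not None:
                problems.append("corrupt member: %s" % bad_member)
    except (zipfile.BadZipFile, zlib.error) as exc:
        # Raised for a truncated file or for damaged compressed data.
        problems.append("unreadable archive: %s" % exc)

    return problems
```

Logging the result of `check_archive(path)` from inside the Celery task and comparing it with the same call in a plain Python shell would show whether both see an identical, intact file.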

My environment

aiodns==2.0.0
aiohttp==3.6.2
amqp==2.5.2
async==0.6.2
async-timeout==3.0.1
attrs==19.3.0
Babel==2.7.0
billiard==3.5.0.5
boto3==1.10.25
botocore==1.12.197
celery==4.1.1
certifi==2019.9.11
cffi==1.13.2
chardet==3.0.4
Django==2.2.3
django-enumfields==1.0.0
django-extensions==2.2.1
djangorestframework==3.10.1
docutils==0.14
flower==0.9.3
gunicorn==19.9.0
idna==2.8
idna-ssl==1.1.0
importlib-metadata==0.23
jmespath==0.9.4
kombu==4.6.6
more-itertools==7.2.0
multidict==4.6.1
pkg-resources==0.0.0
psycopg2-binary==2.8.3
pycares==3.0.0
pycparser==2.19
python-dateutil==2.8.0
pytz==2019.1
redis==3.3.11
requests==2.22.0
s3transfer==0.2.1
six==1.13.0
slackclient==2.0.0
sqlparse==0.3.0
tornado==5.1.1
typing==3.7.4
typing-extensions==3.7.4
urllib3==1.25.7
vine==1.3.0
websocket-client==0.56.0
yarl==1.3.0
zipp==0.6.0

Any ideas? Suggestions?

0 answers