I'm having a hard time getting to the bottom of this memory usage problem.
Goal:
Create a zip out of a large number of files and save it on S3.
Background and steps taken:
The files are images between 10 and 50 MB each, and there are thousands of them.
I have raised the Lambda to 3 GB of memory, and I was hoping to fit 2.5-3 GB of files into each zip.
Problem:
Right now I am testing with a single file of exactly 1 GB. The problem is that Lambda reports around 2085 MB of memory used. I understand there will be some overhead, but why 2 GB of it?
Stretch goal:
Create zips larger than 3 GB.
Tests and results:
If I comment out in_memory.seek(0) and the object.put to S3, the memory doubling does not happen (but then no file gets created either...). I tried to confirm what that final read() costs with the sketch below.
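This is a minimal sketch I ran outside of Lambda; the 512 MiB payload is just a stand-in for the finished zip, and tracemalloc is from the standard library:

import tracemalloc
from io import BytesIO

tracemalloc.start()

# Stand-in for the finished zip: write 512 MiB into a BytesIO
buf = BytesIO()
buf.write(b"x" * (512 * 1024 ** 2))
buf.seek(0)

before = tracemalloc.get_traced_memory()[0]
data = buf.read()  # read() returns a brand-new bytes object, i.e. a full copy
after = tracemalloc.get_traced_memory()[0]

print(f"before read(): {before / 2**20:.0f} MiB")  # ~512 MiB (the internal buffer)
print(f"after read():  {after / 2**20:.0f} MiB")   # ~1024 MiB (buffer + copy)

Locally this roughly doubles, which looks consistent with the numbers Lambda reports, though I may be misreading how Lambda accounts for peak memory.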
Current code:
import boto3
from io import BytesIO
from zipfile import ZipFile

bucket = 'dev'
s3 = boto3.resource('s3')
keys = [
    "test/3g/file1"
]
list_index = 1
# Set up a memory store to build the zip file in
in_memory = BytesIO()
# Initiate the zip file with "append" settings
zf = ZipFile(in_memory, mode="a")
# Iterate over the keys and add each S3 object to the zip
for key in keys:
    obj = s3.Object(bucket, key).get()
    # Grab the filename without the key prefix to create a "flat" zip file
    filename = key.split('/')[-1]
    # Read the body of each object into the zip
    zf.writestr(filename, obj['Body'].read())
# Close the zip file
zf.close()
# Go back to the beginning of the memory store
in_memory.seek(0)
# Upload the zip file to S3
obj = s3.Object('dev', 'test/export-3G-' + str(list_index) + '.zip')
obj.put(Body=in_memory.read())
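
For what it's worth, this is the variant of the upload I am considering. As far as I can tell, boto3 accepts a seekable file-like object as Body, so the buffer could be handed over directly instead of being copied with read(); I have not verified this inside Lambda yet:

# Untested variant: hand the BytesIO itself to put() so no second copy is made
in_memory.seek(0)
obj = s3.Object('dev', 'test/export-3G-' + str(list_index) + '.zip')
obj.put(Body=in_memory)

Alternatively, s3.Object(...).upload_fileobj(in_memory) does a managed multipart upload from a file-like object, which might also be relevant to the stretch goal.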