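memory_profiler reports what looks like a memory leak when uploading a file to Google Cloud Storage. Since files this large will be uploaded from a 128 MB GCF instance or an f1-micro GCE instance, how can I prevent this memory leak?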
$ python -m memory_profiler tests/test_gcp_storage.py
67108864
Filename: tests/test_gcp_storage.py
Line # Mem usage Increment Line Contents
================================================
48 35.586 MiB 35.586 MiB @profile
49 def test_upload_big_file():
50 35.586 MiB 0.000 MiB from google.cloud import storage
51 35.609 MiB 0.023 MiB client = storage.Client()
52
53 35.609 MiB 0.000 MiB m_bytes = 64
54 35.609 MiB 0.000 MiB filename = int(datetime.utcnow().timestamp())
55 35.609 MiB 0.000 MiB blob_name = f'test/{filename}'
56 35.609 MiB 0.000 MiB bucket_name = 'my_bucket'
57 38.613 MiB 3.004 MiB bucket = client.get_bucket(bucket_name)
58
59 38.613 MiB 0.000 MiB with open(f'/tmp/{filename}', 'wb+') as file_obj:
60 38.613 MiB 0.000 MiB file_obj.seek(m_bytes * 1024 * 1024 - 1)
61 38.613 MiB 0.000 MiB file_obj.write(b'\0')
62 38.613 MiB 0.000 MiB file_obj.seek(0)
63
64 38.613 MiB 0.000 MiB blob = bucket.blob(blob_name)
65 102.707 MiB 64.094 MiB blob.upload_from_file(file_obj)
66
67 102.715 MiB 0.008 MiB blob = bucket.get_blob(blob_name)
68 102.719 MiB 0.004 MiB print(blob.size)
In addition, if the file is not opened in binary mode, the memory leak is about twice the file size.
67108864
Filename: tests/test_gcp_storage.py
Line # Mem usage Increment Line Contents
================================================
48 35.410 MiB 35.410 MiB @profile
49 def test_upload_big_file():
50 35.410 MiB 0.000 MiB from google.cloud import storage
51 35.441 MiB 0.031 MiB client = storage.Client()
52
53 35.441 MiB 0.000 MiB m_bytes = 64
54 35.441 MiB 0.000 MiB filename = int(datetime.utcnow().timestamp())
55 35.441 MiB 0.000 MiB blob_name = f'test/{filename}'
56 35.441 MiB 0.000 MiB bucket_name = 'my_bucket'
57 38.512 MiB 3.070 MiB bucket = client.get_bucket(bucket_name)
58
59 38.512 MiB 0.000 MiB with open(f'/tmp/{filename}', 'w+') as file_obj:
60 38.512 MiB 0.000 MiB file_obj.seek(m_bytes * 1024 * 1024 - 1)
61 38.512 MiB 0.000 MiB file_obj.write('\0')
62 38.512 MiB 0.000 MiB file_obj.seek(0)
63
64 38.512 MiB 0.000 MiB blob = bucket.blob(blob_name)
65 152.250 MiB 113.738 MiB blob.upload_from_file(file_obj)
66
67 152.699 MiB 0.449 MiB blob = bucket.get_blob(blob_name)
68 152.703 MiB 0.004 MiB print(blob.size)
Gist: https://gist.github.com/northtree/8b560a6b552a975640ec406c9f701731
Answer 0 (score: 0)
To limit the amount of memory used during the upload, you need to explicitly configure a chunk size on the blob before calling upload_from_file():
blob = bucket.blob(blob_name, chunk_size=10*1024*1024)
blob.upload_from_file(file_obj)
I agree that this is poor default behavior in the Google client SDK, and the workaround is poorly documented as well.
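For reuse, the workaround can be wrapped in a small helper. This is only a minimal sketch: aligned_chunk_size and upload_in_chunks are hypothetical names, not part of google-cloud-storage, and the rounding assumes the library's requirement that chunk_size be a multiple of 256 KiB. The bucket argument is assumed to be an object returned by client.get_bucket(), as in the question.

```python
KIB_256 = 256 * 1024  # chunk_size is assumed to need to be a multiple of 256 KiB


def aligned_chunk_size(target_bytes: int) -> int:
    """Round target_bytes up to the nearest positive multiple of 256 KiB."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    # Ceiling division without floats, then scale back to bytes.
    return -(-target_bytes // KIB_256) * KIB_256


def upload_in_chunks(bucket, blob_name, file_obj,
                     target_chunk_bytes=10 * 1024 * 1024):
    """Upload file_obj as blob_name with an explicit chunk size.

    `bucket` is assumed to be a google.cloud.storage Bucket; this
    hypothetical helper only calls bucket.blob() and
    blob.upload_from_file(), exactly as in the snippet above.
    """
    chunk_size = aligned_chunk_size(target_chunk_bytes)
    blob = bucket.blob(blob_name, chunk_size=chunk_size)
    blob.upload_from_file(file_obj)
    return blob
```

With the file from the question this would be called as upload_in_chunks(bucket, blob_name, file_obj). The expectation is that the upload then buffers about one chunk at a time rather than the whole file; re-running memory_profiler on line 65 would confirm whether that holds.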