Exponential backoff for batch-deleting objects in the Python client

Time: 2018-09-12 23:13:40

Tags: google-cloud-storage

My request looks like the one in "Batch request with Google Cloud Storage python client":

blobs_to_delete = [blob for blob in bucket.list_blobs(prefix="my/prefix/here")]

# _chunk is a helper (not shown) that yields lists of at most batch_size blobs.
for c in _chunk(blobs_to_delete, batch_size=100):
    with storage_client.batch():
        for blob in c:
            blob.delete()

The error is:

[2018-09-12 21:28:41,726] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/python2.7/site-packages/google/cloud/storage/batch.py", line 243, in _finish_futures
[2018-09-12 21:28:41,731] {base_task_runner.py:98} INFO - Subtask:     raise exceptions.from_http_response(exception_args)
[2018-09-12 21:28:41,731] {base_task_runner.py:98} INFO - Subtask: google.api_core.exceptions.InternalServerError: 500 BATCH contentid://None: Backend Error

How can I add truncated exponential backoff to my code?

1 answer:

Answer 0 (score: 1)

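The batch code does not retry on its own, and it does not let you find out exactly which of the requests inside it failed. This means that (a) you have to retry yourself, and (b) you have to retry the whole batch.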

Using the retrying package makes (a) easier. (b) is not a problem, because deleting a blob is idempotent.
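To make "truncated exponential backoff" concrete, here is a minimal hand-written sketch of the idea using only the standard library. The function name `call_with_backoff` and all of its parameters are hypothetical and chosen for illustration; this is not how the retrying package or the storage client implements anything, just one way the waiting scheme could be written.

```python
import random
import time


def call_with_backoff(fn, is_retriable, max_attempts=7, base=1.0, cap=10.0,
                      jitter=0.0, sleep=time.sleep):
    """Call fn(), retrying retriable failures with truncated exponential backoff.

    Every name and default here is illustrative, not part of any library.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Give up on the final attempt or on errors we should not retry.
            if last_attempt or not is_retriable(exc):
                raise
            # Delay grows as base, 2*base, 4*base, ... plus optional random
            # jitter, and is truncated at `cap` seconds.
            delay = min(cap, base * 2 ** attempt + random.uniform(0, jitter))
            sleep(delay)
```

With the defaults and no jitter, the waits between attempts are 1, 2, 4, 8, 10, 10 seconds. The cap plays the same role as the `wait_exponential_max=10000` setting (in milliseconds) used with the retrying decorator in this answer, although the exact first delay of that package may differ from this sketch.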

Putting it together, a solution might look like:

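from google.api_core.exceptions import GoogleAPICallError
from retrying import retry


def retriable_exception(e):
    # Retry on 429 (too many requests) and on any 5xx server error.
    return (isinstance(e, GoogleAPICallError)
            and e.code is not None
            and (e.code == 429 or e.code >= 500))


# Wait times are in milliseconds: exponential backoff capped at 10 s,
# giving up after 7 attempts.
@retry(retry_on_exception=retriable_exception,
       stop_max_attempt_number=7,
       wait_exponential_multiplier=1000,
       wait_exponential_max=10000)
def delete_batch(c):
    with storage_client.batch():
        for blob in c:
            blob.delete()


blobs_to_delete = [blob for blob in bucket.list_blobs(prefix="my/prefix/here")]

for c in _chunk(blobs_to_delete, batch_size=100):
    delete_batch(c)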