I'm trying to migrate a fairly large amount of data from GCS to AppEngine using a task queue and 20 backend instances. The problem is that the new cloud storage library doesn't seem to respect the urlfetch timeout, or something else is going on.
import cloudstorage as gcs
gcs.set_default_retry_params(gcs.RetryParams(urlfetch_timeout=60,
                                             max_retry_period=300))
...
with gcs.open(fn, 'r') as fp:
    raw_gcs_file = fp.read()
So this works fine when the queues are paused and I run one task at a time, but when I try to run 20 concurrent tasks against the 20 backends, the following starts happening:
I 2013-07-20 00:18:16.418 Got exception while contacting GCS. Will retry in 0.2 seconds.
I 2013-07-20 00:18:16.418 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:21.553 Got exception while contacting GCS. Will retry in 0.4 seconds.
I 2013-07-20 00:18:21.554 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:25.728 Got exception while contacting GCS. Will retry in 0.8 seconds.
I 2013-07-20 00:18:25.728 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:31.428 Got exception while contacting GCS. Will retry in 1.6 seconds.
I 2013-07-20 00:18:31.428 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:34.301 Got exception while contacting GCS. Will retry in -1 seconds.
I 2013-07-20 00:18:34.301 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:34.301 Urlfetch retry 5 failed after 22.8741798401 seconds total
How can it fail after 22 seconds? It doesn't seem to be using the retry parameters at all.
Answer (score: 1):
This is a bug in the gcs client library. It will be fixed soon. Thanks!
Your hack would work. But if it still times out frequently, you can try fp.read(size=some_size). If the file is large, a 32 MB response (the URLfetch response size limit) against a 90-second deadline assumes a transfer rate of about 364 KB/s.
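A minimal sketch of what chunked reads could look like with the GAE cloudstorage client; the chunk size and the helper name read_in_chunks are illustrative choices, not part of the library:

import cloudstorage as gcs

# Illustrative chunk size: small enough that each urlfetch call stays well
# under the 32 MB response limit and the request deadline.
CHUNK_SIZE = 4 * 1024 * 1024

def read_in_chunks(filename):
    # Read a GCS object piece by piece instead of one large fp.read().
    parts = []
    with gcs.open(filename, 'r') as fp:
        while True:
            chunk = fp.read(CHUNK_SIZE)
            if not chunk:
                break
            parts.append(chunk)
    return ''.join(parts)

Each read() then maps to a smaller fetch that can time out and be retried on its own, instead of retrying the transfer of the whole file.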