I am trying to download a large number of small files from an S3 bucket. I am doing this with the following code:
import boto3

s3 = boto3.client('s3')
# `bucket` is assumed to hold the bucket name (defined elsewhere).
kwargs = {'Bucket': bucket}
with open('/Users/hr/Desktop/s3_backup/files.csv', 'w') as file:
    while True:
        # The S3 API response is a large blob of metadata.
        # 'Contents' contains information about the listed objects.
        resp = s3.list_objects_v2(**kwargs)
        try:
            contents = resp['Contents']
        except KeyError:
            # No objects were returned, so stop listing.
            break
        for obj in contents:
            key = obj['Key']
            file.write(key)
            file.write('\n')
        # The S3 API is paginated, returning up to 1000 keys at a time.
        # Pass the continuation token into the next request, until we
        # reach the final page (when this field is missing).
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break
However, after a while I get this error message: 'EndpointConnectionError: Could not connect to the endpoint URL'.
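I know there are more files in the S3 bucket. I have three questions:
Why does this error occur when I have not yet downloaded all of the files in the bucket?
Is there a way to restart my code from the last file I downloaded from the S3 bucket? (I don't want to re-download the file names I have already downloaded.)
Is there a default ordering of the keys in an S3 bucket, and is it alphabetical?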
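To make the second question concrete, here is a rough sketch of the kind of resume logic I have in mind. It assumes that the listing order is stable and lexicographic (which is what the third question is really asking), and it relies on the `StartAfter` parameter of `list_objects_v2` to skip keys that are already in the file. The helper names `last_written_key` and `append_remaining_keys` are made up for illustration, and the sketch does not handle a last line that was only partially written when the connection dropped.

```python
import os


def last_written_key(path):
    """Return the last non-empty line of the key file, or None if there is none."""
    if not os.path.exists(path):
        return None
    last = None
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                last = line
    return last


def append_remaining_keys(s3, bucket, path):
    """Append keys that sort after the last key already stored in `path`.

    `s3` is any object with a boto3-style `list_objects_v2` method.
    Returns the number of keys appended.
    """
    kwargs = {"Bucket": bucket}
    start_after = last_written_key(path)
    if start_after is not None:
        # Assumption: keys are listed in a stable order, so everything
        # after this key has not been written yet.
        kwargs["StartAfter"] = start_after
    written = 0
    # Open in append mode so earlier results are kept.
    with open(path, "a") as f:
        while True:
            resp = s3.list_objects_v2(**kwargs)
            for obj in resp.get("Contents", []):
                f.write(obj["Key"] + "\n")
                written += 1
            token = resp.get("NextContinuationToken")
            if token is None:
                break
            kwargs["ContinuationToken"] = token
    return written


# Hypothetical usage with a real client:
#   import boto3
#   append_remaining_keys(boto3.client("s3"), bucket, "files.csv")
```

Running this again after a failure would then only append the keys that come after the last line of the file, instead of starting the listing from scratch.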