I am trying to list all directories in an S3 bucket using Python and Boto3.
I am using the following code:
s3 = session.resource('s3') # I already have a boto3 Session object
bucket_names = [
    'this/bucket/',
    'that/bucket/'
]
for name in bucket_names:
    bucket = s3.Bucket(name)
    for obj in bucket.objects.all(): # this raises an exception
        # handle obj
When I run it, I get the following exception stack trace:
  File "botolist.py", line 67, in <module>
    for obj in bucket.objects.all():
  File "/Library/Python/2.7/site-packages/boto3/resources/collection.py", line 82, in __iter__
    for page in self.pages():
  File "/Library/Python/2.7/site-packages/boto3/resources/collection.py", line 165, in pages
    for page in pages:
  File "/Library/Python/2.7/site-packages/botocore/paginate.py", line 83, in __iter__
    response = self._make_request(current_kwargs)
  File "/Library/Python/2.7/site-packages/botocore/paginate.py", line 155, in _make_request
    return self._method(**current_kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 270, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 335, in _make_api_call
    raise ClientError(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (NoSuchKey) when calling the ListObjects operation: The specified key does not exist.
What is the correct way to list the directories in a bucket?
Thanks a lot...
Answer 0 (score: 10)
Alternatively, you can use boto3.client.
Example
>>> import boto3
>>> client = boto3.client('s3')
>>> client.list_objects(Bucket='MyBucket')
list_objects also supports other arguments that you may need in order to iterate through the results: Bucket, Delimiter, EncodingType, Marker, MaxKeys, Prefix.
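As a rough illustration of how the Delimiter and Prefix arguments relate to the original question, the sketch below collects the "directory"-like entries that S3 reports under CommonPrefixes when a delimiter is supplied. The function name list_top_level_prefixes is hypothetical, and the code assumes `client` is an object such as boto3.client('s3') and that "/" is the separator used in the keys.

```python
def list_top_level_prefixes(client, bucket, prefix=""):
    """Return the 'directory'-like common prefixes directly under `prefix`.

    `client` is assumed to behave like boto3.client('s3').
    """
    paginator = client.get_paginator("list_objects")
    prefixes = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # 'CommonPrefixes' is absent from a page that has no grouped keys.
        for entry in page.get("CommonPrefixes", []):
            prefixes.append(entry["Prefix"])
    return prefixes
```

With a real client, a call might look like list_top_level_prefixes(boto3.client('s3'), 'MyBucket'). S3 has no real directories, so what comes back is simply the set of key prefixes that end at the first "/" after the given prefix.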
Answer 1 (score: 1)
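I would have thought that you cannot have a slash in a bucket name. You say you want to list all directories within a bucket, but your code attempts to list all contents (not necessarily directories) within several buckets. These buckets probably do not exist (because their names are illegal). So when you run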
bucket = s3.Bucket(name)
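bucket probably does not refer to a real bucket (boto3 creates the resource object lazily, so nothing is checked at this point), and the subsequent listing will fail.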
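If the strings in the question were meant as "bucket name plus key prefix" (an assumption; the question does not say so), one possible fix is to split each string at the first slash and pass the remainder as a prefix filter. The helper names split_bucket_and_prefix and iter_keys below are hypothetical, and `s3_resource` is assumed to be the object returned by session.resource('s3').

```python
def split_bucket_and_prefix(path):
    """Split 'bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    bucket, _, prefix = path.partition("/")
    return bucket, prefix


def iter_keys(s3_resource, path):
    """Yield the keys under the prefix part of `path`.

    `s3_resource` is assumed to behave like session.resource('s3').
    """
    bucket_name, prefix = split_bucket_and_prefix(path)
    bucket = s3_resource.Bucket(bucket_name)
    # Filter by prefix rather than treating the whole path as a bucket name.
    for obj in bucket.objects.filter(Prefix=prefix):
        yield obj.key
```

For 'this/bucket/', this treats 'this' as the bucket and 'bucket/' as the key prefix.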
Answer 2 (score: 0)
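The best way to get a list of ALL objects with a specific prefix in an S3 bucket is to use list_objects_v2 together with ContinuationToken to get past the pagination limit of 1000 objects per request.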
import boto3

s3 = boto3.client('s3')

s3_bucket = 'your-bucket'
s3_prefix = 'your/prefix'

# First page of results (at most 1000 keys)
partial_list = s3.list_objects_v2(
    Bucket=s3_bucket,
    Prefix=s3_prefix)
# 'Contents' is missing from the response when nothing matches the prefix
obj_list = partial_list.get('Contents', [])
# Keep requesting pages until the listing is no longer truncated
while partial_list['IsTruncated']:
    next_token = partial_list['NextContinuationToken']
    partial_list = s3.list_objects_v2(
        Bucket=s3_bucket,
        Prefix=s3_prefix,
        ContinuationToken=next_token)
    obj_list.extend(partial_list.get('Contents', []))
Answer 3 (score: 0)
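All of these other responses are poor. Using client.list_objects() limits you to at most 1,000 results per call. The rest of the answers are either wrong or too complex.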
Dealing with the continuation token yourself is a terrible idea. Just use a paginator, which handles that logic for you.
The solution you want is:
# client = boto3.client('s3')
[e['Key']
 for p in client.get_paginator("list_objects_v2").paginate(Bucket='my_bucket')
 for e in p.get('Contents', [])]  # 'Contents' is missing on an empty page
Answer 4 (score: 0)
If the folder contains fewer than 1,000 objects, you can use the following code:
import boto3
s3 = boto3.client('s3')
object_listing = s3.list_objects_v2(Bucket='bucket_name',
                                    Prefix='folder/sub-folder/')
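The response above is a plain dictionary, so the keys still have to be pulled out of it. A minimal sketch of that step follows, assuming `listing` has the shape of a list_objects_v2 response; the helper name keys_from_listing is hypothetical. It also reports whether the listing was truncated, which would indicate that the folder holds more objects than a single request returned.

```python
def keys_from_listing(listing):
    """Return (keys, truncated) for a list_objects_v2-style response dict."""
    # 'Contents' is absent when no object matches the prefix.
    keys = [entry["Key"] for entry in listing.get("Contents", [])]
    # 'IsTruncated' is True when more results remain on the server.
    truncated = listing.get("IsTruncated", False)
    return keys, truncated
```

For the snippet above, the call would be keys, truncated = keys_from_listing(object_listing).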