How to download a file from S3 in Python with boto3, given its file path

Asked: 2018-04-14 07:01:27

Tags: python amazon-s3 boto3

This is very basic, but I cannot download a file given its S3 path.

For example, I have s3://name1/name2/file_name.txt

import boto3
locations = ['s3://name1/name2/file_name.txt']
s3_client = boto3.client('s3')
bucket = 'name1'
prefix = 'name2'

for file in locations:
    s3_client.download_file(bucket, 'file_name.txt', 'my_local_folder')

I get the error botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found

The file exists, because I can download it with the AWS CLI using the S3 path s3://name1/name2/file_name.txt.

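2 answers:

Answer 0 (score: 4)

You need a list of object keys, and then you need to modify your code as shown in the documentation. The key must be the object's full path inside the bucket (name2/file_name.txt, not just file_name.txt), and the last argument to download_file must be a local file path rather than a folder: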

import os
import boto3
import botocore

# Object keys are relative to the bucket and include the "folder" prefix.
files = ['name2/file_name.txt']

bucket = 'name1'

s3 = boto3.resource('s3')

for file in files:
    try:
        # Save each object in the current directory under its base name.
        s3.Bucket(bucket).download_file(file, os.path.basename(file))
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise
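
Since the question starts from full s3:// paths, the bucket and key can be derived from each path instead of being hard-coded. The following is a minimal sketch under that assumption; split_s3_url and download_s3_url are hypothetical helper names (not part of boto3), and the client is passed in so the caller decides how it is created.

```python
import os


def split_s3_url(url):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    scheme = "s3://"
    if not url.startswith(scheme):
        raise ValueError("not an S3 URL: {!r}".format(url))
    # Everything up to the first '/' is the bucket; the rest is the object key.
    bucket, _, key = url[len(scheme):].partition("/")
    if not bucket or not key:
        raise ValueError("S3 URL needs a bucket and a key: {!r}".format(url))
    return bucket, key


def download_s3_url(s3_client, url, local_dir="."):
    """Download the object at `url` into `local_dir`; return the local path."""
    bucket, key = split_s3_url(url)
    # download_file expects a local file path, not a directory.
    local_path = os.path.join(local_dir, os.path.basename(key))
    s3_client.download_file(bucket, key, local_path)
    return local_path
```

With the question's data this would be called as download_s3_url(boto3.client('s3'), 's3://name1/name2/file_name.txt', 'my_local_folder'), assuming my_local_folder already exists.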

Answer 1 (score: 0)

You may need some form of authentication to do this. There are several ways, but creating a session is simple and quick:

from boto3.session import Session

bucket_name = 'your_bucket_name'
folder_prefix = 'your/path/to/download/files'
credentials = 'credentials.txt'

# credentials.txt holds one line: <access key ID>:<secret access key>
with open(credentials, 'r', encoding='utf-8') as f:
    line = f.readline().strip()
    # Split on the first colon only.
    access_key, secret_key = line.split(':', 1)

# Create a session that authenticates with the keys read above.
session = Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)

s3 = session.resource('s3')
bucket = s3.Bucket(bucket_name)

# Download every object whose key starts with the prefix into /tmp.
for s3_file in bucket.objects.filter(Prefix=folder_prefix):
    file_object = s3_file.key
    file_name = file_object.split('/')[-1]
    if not file_name:
        # Keys ending in '/' are folder placeholders, not files.
        continue
    print('Downloading file {} ...'.format(file_object))
    bucket.download_file(file_object, '/tmp/{}'.format(file_name))
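
The loop above saves every object directly under /tmp by its base name, so two keys that end in the same file name would overwrite each other. One possible variation, sketched here as an assumption rather than as part of the original answer, recreates the key's folder structure under a destination directory. local_path_for_key and download_prefix are hypothetical helper names; the bucket argument is expected to be a boto3 Bucket resource such as the one created above.

```python
import os


def local_path_for_key(key, dest_dir):
    """Map an S3 key to a path under dest_dir, or None if there is no file to save."""
    if not key or key.endswith("/"):
        return None  # folder placeholder, nothing to download
    # Drop empty, '.' and '..' segments so the path cannot leave dest_dir.
    parts = [p for p in key.split("/") if p not in ("", ".", "..")]
    if not parts:
        return None
    return os.path.join(dest_dir, *parts)


def download_prefix(bucket, prefix, dest_dir):
    """Download all objects under `prefix`, keeping their folder layout."""
    downloaded = []
    for obj in bucket.objects.filter(Prefix=prefix):
        local_path = local_path_for_key(obj.key, dest_dir)
        if local_path is None:
            continue
        # Create the local folders that mirror the key's prefix.
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        bucket.download_file(obj.key, local_path)
        downloaded.append(local_path)
    return downloaded
```

With the variables from the answer, this would be called as download_prefix(bucket, folder_prefix, '/tmp').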

The credentials.txt file must contain a single line with the access key ID and the secret key joined by a colon, for example:

~$ cat credentials.txt
AKIAIO5FODNN7EXAMPLE:ABCDEF+c2L7yXeGvUyrPgYsDnWRRC1AYEXAMPLE

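Don't forget to protect this file well on the host: give the user that runs the program read-only permission. I hope this works for you; it has worked very well for me.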
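
As a rough illustration of the two points above (the one-line file format and the read-only permission), here is a minimal sketch. read_credentials and restrict_to_owner_read are hypothetical helper names, and the permission change assumes a POSIX host; on Windows, os.chmod only toggles the read-only flag.

```python
import os
import stat


def read_credentials(path):
    """Return (access_key, secret_key) from a one-line 'ID:SECRET' file."""
    with open(path, "r", encoding="utf-8") as f:
        line = f.readline().strip()
    # Split on the first colon only; both halves must be present.
    access_key, sep, secret_key = line.partition(":")
    if not sep or not access_key or not secret_key:
        raise ValueError("expected '<access key ID>:<secret key>' in " + path)
    return access_key, secret_key


def restrict_to_owner_read(path):
    """Make the file readable by its owner only (mode 0o400 on POSIX)."""
    os.chmod(path, stat.S_IRUSR)
```

The returned pair can be passed to Session(aws_access_key_id=..., aws_secret_access_key=...) as in the answer. Note that boto3 also reads the shared credentials file and environment variables that the AWS CLI uses, so a custom file like this is only needed when those are not available.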