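I'm new to Python and trying to understand how to access files inside the folder structure of an S3 bucket with Python. Should I specify something like bucket_key = "path/folder/file"? Please help.
My main goal is to get the row count of a CSV file, but I get an error when reading the file.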
import os
import sys
import string
import urllib
import urllib2
import boto
import boto.cloudformation
import boto.exception
import boto.sns
import logging
from boto.s3.key import Key
from boto.s3.connection import S3Connection
#?pass this as a parameter?
bucket_name = "reporting"
bucket_key = "/compliance/testfile.csv"
def read_contents(bucket_key):
    # connect to the bucket
    conn = boto.connect_s3()
    bucket = conn.get_bucket(bucket_name)
    key = bucket_key
    # create a key to keep track
    k = Key(bucket)
    k.key = key
    testfile = k.get_contents_as_string()
    return testfile
test = read_contents(bucket_key)
print test
Answer 0 (score: -2)
Hi, let's define a few functions for accessing and downloading files from AWS S3. First, you have to connect to Amazon.
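import boto
from boto.exception import NoAuthHandlerFound
from boto.s3.connection import S3Connection
def s3_conn(conf):
    """ Connect to S3
    :param conf - dict: contains AWS credentials
    :return S3Connection:
    """
    try:
        # Uses credentials found in the environment or the boto config file
        s3 = boto.connect_s3()
        return s3
    except NoAuthHandlerFound:
        # No credentials were found automatically, so fall back to conf
        key_id = conf.get('AWS_ACCESS_KEY_ID')
        access_key = conf.get('AWS_SECRET_ACCESS_KEY')
        return S3Connection(key_id, access_key)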
The conf object is a dictionary containing your AWS credentials.
Once that is done, we can focus on the download step.
import os
from boto.s3.key import Key
def download_file_s3(filename, dirs3, output_path, buckets3, conf):
    """
    :param filename - str: filename
    :param dirs3 - str: folder path (key prefix) of the file inside the bucket
    :param output_path - str: output path
    :param buckets3 - str: bucket name
    :param conf - dict: contains AWS credentials
    """
    print('Downloading file from s3, filename={}, output_path={}, dirs3={}, buckets3={}'.format(
        filename, output_path, dirs3, buckets3))
    # The S3 key is the folder path and the file name joined with '/'
    filepath = '/'.join([dirs3, filename])
    s3 = s3_conn(conf)
    bucket = s3.get_bucket(buckets3)
    # get_key returns None when no such key exists in the bucket
    key = bucket.get_key(filepath)
    key.get_contents_to_filename(os.path.join(output_path, filename))
    print('File saved, output path={}'.format(os.path.join(output_path, filename)))
    key.close()
    s3.close()
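download_file_s3 takes filename as a parameter; dirs3 is the path of the folder that holds the file inside the bucket (it is joined with filename to form the full key). You can also set output_path and buckets3, the name of the bucket. Finally, you have to supply the dictionary containing your AWS credentials.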
So this function builds the path to your file, connects to Amazon S3, gets the requested bucket, and then downloads your file.
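An S3 key is a single string rather than a real directory path, so a leading slash becomes part of the key name. The key in the question starts with "/", which may be why the read fails, though that is an assumption since the error message is not shown. As a minimal sketch, a hypothetical helper (build_s3_key is not part of boto) could normalize the pieces before they are joined:

```python
def build_s3_key(dirs3, filename):
    """Join a folder path and a file name into an S3 key without stray slashes.

    Hypothetical helper for illustration; it is not part of boto.
    """
    # Drop leading and trailing slashes from each piece, then skip empty pieces
    parts = [part.strip('/') for part in (dirs3, filename)]
    return '/'.join(part for part in parts if part)


# Usage:
#   build_s3_key('/compliance/', 'testfile.csv')  ->  'compliance/testfile.csv'
```

The result could be passed to bucket.get_key in place of the plain '/'.join call.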
If you want to download a file from S3 safely, just use this function:
from boto.exception import S3ResponseError
import sys
def safe_download_from_s3(filename, output_path, buckets3, dirs3, conf):
    """
    :param filename - str: filename
    :param dirs3 - str: folder path (key prefix) of the file inside the bucket
    :param output_path - str: output path
    :param buckets3 - str: bucket name
    :param conf - dict: contains AWS credentials
    """
    print('Trying to download file from s3, filename={}, output_path={}, dirs3={}, buckets3={}'.format(
        filename, output_path, dirs3, buckets3))
    try:
        download_file_s3(filename, dirs3, output_path, buckets3, conf)
        print('File downloaded successfully')
    except S3ResponseError as err:
        print('An S3ResponseError occurred while downloading, err={}'.format(err))
    except TypeError as err:
        print('A TypeError occurred while downloading, err={}'.format(err))
    except NameError as err:
        print('A NameError occurred while downloading, err={}'.format(err))
    except Exception:
        # Any other error: report its type instead of letting it propagate
        print('Unexpected error, exec_info={}'.format(sys.exc_info()[0]))
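Once the file is saved locally, the row count the question asks for can be computed with the standard library. The following is only a sketch under a few assumptions: Python 3, a UTF-8 encoded file, and a first line that is a header. The name count_csv_rows is made up for this example.

```python
import csv


def count_csv_rows(path, has_header=True):
    """Count the data rows of a local CSV file (hypothetical helper)."""
    # newline='' lets the csv module handle line breaks inside quoted fields
    with open(path, newline='', encoding='utf-8') as handle:
        total = sum(1 for _ in csv.reader(handle))
    # Leave the header line out of the count when there is one
    return max(total - 1, 0) if has_header else total


# Usage, after the download has finished:
#   rows = count_csv_rows(os.path.join(output_path, filename))
```

Counting with csv.reader rather than counting raw lines keeps a quoted field that spans several lines from being counted as several rows.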
In these examples, conf is a dictionary like this:
conf = {'AWS_ACCESS_KEY_ID': '<your_aws_access_key_id>',
        'AWS_SECRET_ACCESS_KEY': '<your_aws_secret_access>'}
But you can also export your credentials as environment variables. Just do this:
export AWS_ACCESS_KEY_ID=<your_aws_access_key_id>
export AWS_SECRET_ACCESS_KEY=<your_aws_secret_access>
That is the approach I recommend.
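With the variables exported, boto.connect_s3() picks them up on its own, so conf is only needed for the fallback path. If you still want to pass conf explicitly, a small sketch like the one below could build it from the environment; conf_from_env is a hypothetical name, not a boto function.

```python
import os


def conf_from_env(environ=None):
    """Build the conf dict from environment variables (hypothetical helper)."""
    # Accept any mapping so the function can be exercised without touching os.environ
    environ = os.environ if environ is None else environ
    names = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY')
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Fail early with a clear message instead of connecting with empty credentials
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return {name: environ[name] for name in names}


# Usage:
#   conf = conf_from_env()
```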
I hope this helps. If you have any questions, feel free to ask.