I'm using SageMaker and have a bunch of model.tar.gz files that I need to unpack and load into sklearn. I've been testing list_objects with a delimiter to get at the tar.gz files:
response = s3.list_objects(
    Bucket=bucket,
    Prefix='aleks-weekly/models/',
    Delimiter='.csv'
)
for i in response['Contents']:
    print(i['Key'])
I then plan to extract them with:
import tarfile
tf = tarfile.open(model.read())
tf.extractall()
But how do I get the actual tar.gz file from S3, rather than some boto3 object?
Answer 0 (score: 1)
You can use s3.download_file() to download an object to a local file. That would make your code look like this:
import os

import boto3

s3 = boto3.client('s3')
bucket = 'my-bukkit'
prefix = 'aleks-weekly/models/'

# List objects matching your criteria
response = s3.list_objects(
    Bucket=bucket,
    Prefix=prefix,
    Delimiter='.csv'
)

# Iterate over each file found and download it
for i in response['Contents']:
    key = i['Key']
    dest = os.path.join('/tmp', key)
    # The key contains slashes, so create the local directories first
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    print("Downloading file", key, "from bucket", bucket)
    s3.download_file(
        Bucket=bucket,
        Key=key,
        Filename=dest
    )
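If you would rather not write the archive to disk first, the object can also be read into memory and handed to tarfile. Note that tarfile.open() takes a path or a file object, not raw bytes, which is why tarfile.open(model.read()) from the question would not work as written. Below is a minimal sketch of that approach, not a definitive implementation. The function name extract_model_archive and its dest_dir parameter are made up for illustration, and it assumes the archive is small enough to fit in memory.

```python
import io
import os
import tarfile


def extract_model_archive(s3_client, bucket, key, dest_dir):
    """Fetch a .tar.gz object from S3 and unpack it into dest_dir.

    Hypothetical helper for illustration. Returns the member names
    found in the archive.
    """
    # get_object returns a dict whose 'Body' is a readable stream of the object's bytes.
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    buffer = io.BytesIO(body.read())

    os.makedirs(dest_dir, exist_ok=True)
    # Wrap the bytes in BytesIO and pass them via fileobj,
    # since tarfile.open() does not accept raw bytes.
    with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
        names = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Python versions with extraction filters: reject unsafe members.
            archive.extractall(path=dest_dir, filter="data")
        else:
            archive.extractall(path=dest_dir)
    return names
```

It could be called as extract_model_archive(boto3.client('s3'), bucket, key, '/tmp/model'). After that, the extracted files can be loaded however the model was saved. The file name inside the archive depends on your training script, so check the returned names first.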