Question

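I am using AWS SageMaker and trying to upload a data folder from SageMaker to S3. What I want is to upload my data to the s3_train_data location (which already exists in S3). However, the data is not uploaded to that bucket. Instead it is stored in the default bucket that was created, and a new folder hierarchy is created there from the value of the s3_train_data variable.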

The code used to upload the directory:

import os
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
role = get_execution_role()

bucket = '<bucket name>'
prefix = '<folders1/folders2>'
key = '<input>'


s3_train_data = 's3://{}/{}/{}/'.format(bucket, prefix, key)


# 'data' is a folder on the Jupyter instance that contains all the training data
inputs = sagemaker_session.upload_data(path='data', key_prefix=s3_train_data)

Is the problem in the code, or does it have more to do with how I created the notebook?

Answer 1

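You can look at the sample notebooks; there are many ways to upload data to an S3 bucket. I am only giving you a hint here: you forgot to create a boto3 session to access the S3 bucket.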
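If you would rather keep using sagemaker_session.upload_data, the behaviour described in the question is consistent with how that method treats its arguments: the bucket name goes in a separate bucket parameter (the session's default bucket is used when it is omitted), and key_prefix is a plain key prefix inside that bucket, not a full s3:// URI. The following is a minimal sketch under that assumption. split_s3_uri and upload_folder are hypothetical helper names introduced here, not part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected an s3://bucket/prefix URI, got %r" % s3_uri)
    return parsed.netloc, parsed.path.strip("/")


def upload_folder(session, local_dir, s3_uri):
    """Upload local_dir with session.upload_data into the bucket named in s3_uri.

    `session` is assumed to be a sagemaker.Session; this helper is hypothetical.
    """
    bucket, key_prefix = split_s3_uri(s3_uri)
    # Pass the bucket explicitly so the default bucket is not used,
    # and pass only the key prefix (no 's3://bucket/' part).
    return session.upload_data(path=local_dir, bucket=bucket, key_prefix=key_prefix)
```

With the variables from the question, the call would look like inputs = upload_folder(sagemaker_session, 'data', s3_train_data).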

Here is one way to do it with boto3.

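import os
import urllib.request
import boto3

bucket = '<your_s3_bucket_name_here>'  # S3 bucket to upload to


def download(url):
    # Download the file only if it is not already present locally
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)


def upload_to_s3(channel, file):
    # Upload the file to s3://<bucket>/<channel>/<file>
    s3 = boto3.resource('s3')
    key = channel + '/' + file
    with open(file, "rb") as data:
        s3.Bucket(bucket).put_object(Key=key, Body=data)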


# caltech-256
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-train.rec')
upload_to_s3('train', 'caltech-256-60-train.rec')
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-val.rec')
upload_to_s3('validation', 'caltech-256-60-val.rec')

Link: https://buildcustom.notebook.us-east-2.sagemaker.aws/notebooks/sample-notebooks/introduction_to_amazon_algorithms/imageclassification_caltech/Image-classification-fulltraining.ipynb
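The upload_to_s3 helper handles one file at a time, while the question is about a whole folder. A minimal sketch of how the same idea could be extended, assuming boto3's Bucket.upload_file(Filename, Key) and a local folder of files. iter_upload_keys and upload_dir_to_s3 are hypothetical names introduced for illustration.

```python
import os


def iter_upload_keys(local_dir, prefix):
    """Yield (local_path, s3_key) for every file under local_dir."""
    prefix = prefix.strip("/")
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # Path relative to local_dir, using forward slashes for the S3 key
            rel = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            yield local_path, (prefix + "/" + rel) if prefix else rel


def upload_dir_to_s3(local_dir, bucket_name, prefix):
    """Upload every file under local_dir to s3://bucket_name/prefix/..."""
    import boto3  # imported here so the key-building helper works without boto3

    s3_bucket = boto3.resource("s3").Bucket(bucket_name)
    for local_path, key in iter_upload_keys(local_dir, prefix):
        s3_bucket.upload_file(local_path, key)
```

For the layout in the question, this would be called as upload_dir_to_s3('data', bucket, prefix + '/' + key), which keeps the sub-folder structure of 'data' under that prefix.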

Another approach:

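import io
import os
import boto3
import sagemaker.amazon.common as smac

bucket = '<your_s3_bucket_name_here>'  # enter your s3 bucket where you will copy data and model artifacts
prefix = 'sagemaker/breast_cancer_prediction'  # place to upload training files within the bucket
# do some processing, then prepare to push the data
# (train_X, train_y and train_file are assumed to be defined by that processing step)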

f = io.BytesIO()
smac.write_numpy_to_dense_tensor(f, train_X.astype('float32'), train_y.astype('float32'))
f.seek(0)

boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train', train_file)).upload_fileobj(f)

Link: https://buildcustom.notebook.us-east-2.sagemaker.aws/notebooks/sample-notebooks/introduction_to_applying_machine_learning/breast_cancer_prediction/Breast%20Cancer%20Prediction.ipynb

YouTube link: https://www.youtube.com/watch?v=-YiHPIGyFGo - how to pull data in an S3 bucket.
