Question

How do I set up direct access to a private S3 bucket for TensorFlow?

After running
from tensorflow.python.lib.io import file_io and then print(file_io.stat('s3://my/private/bucket/file.json')), I end up with this error:
NotFoundError: Object s3://my/private/bucket/file.json does not exist

However, the same line on a public object works without an error:
print(file_io.stat('s3://ryft-public-sample-data/wikipedia-20150518.bin'))

There appears to be an article on S3 support here: https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/s3.md
However, I still get the same error after exporting the variables it lists.

I have all credentials set up for awscli, and boto3 can view and download the file in question. How can I give TensorFlow direct S3 access when the bucket is private?

Answer 1

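I had the same problem when trying to access files in a private S3 bucket from a SageMaker notebook. My mistake was trying to use credentials obtained from boto3, which do not seem to be valid outside of it.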

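The solution was not to specify credentials at all (in which case the role attached to the machine is used), and to specify only the region name (for some reason it was not read from the ~/.aws/config file), as follows: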

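import os

import boto3

# Do not pass credentials: the role attached to the machine is used instead.
session = boto3.Session()
# Export only the region, since it was not picked up from ~/.aws/config.
os.environ['AWS_REGION'] = session.region_name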
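
If you would rather not create a boto3 session just to look up the region, the same idea can be sketched with the standard library alone. This is a minimal sketch, not something from the answer above: the helper names region_from_aws_config and export_region are hypothetical, and it assumes the usual AWS config layout ("[default]" for the default profile, "[profile NAME]" for named ones, each with a "region" key).

```python
import configparser
import os
from pathlib import Path


def region_from_aws_config(config_path=None, profile="default"):
    """Return the region configured for `profile`, or None if it is absent.

    Hypothetical helper: reads the AWS config file directly instead of
    going through boto3.
    """
    raw_path = config_path or os.environ.get("AWS_CONFIG_FILE", "~/.aws/config")
    path = Path(raw_path).expanduser()
    parser = configparser.ConfigParser()
    parser.read(path)  # yields no sections if the file does not exist
    # Named profiles are stored as "[profile NAME]", the default as "[default]".
    section = "default" if profile == "default" else f"profile {profile}"
    if parser.has_section(section):
        return parser.get(section, "region", fallback=None)
    return None


def export_region(region):
    """Set AWS_REGION only when a region was found; credentials are untouched."""
    if region:
        os.environ["AWS_REGION"] = region
    return os.environ.get("AWS_REGION")
```

Calling export_region(region_from_aws_config()) before the file_io.stat call from the question would then play the same role as the boto3 snippet, with credentials still coming from the attached role.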

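Note: when debugging this error, it is useful to look at the CloudWatch logs, because the S3 client's logs are printed only there and not in the Jupyter notebook. That is where I first saw the following: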

When I did specify the credentials from boto3, the error was: The AWS Access Key Id you provided does not exist in our records.
When accessing without the AWS_REGION environment variable set, I got: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. This is common when you do not specify the bucket's region (see "301 Moved Permanently after S3 uploading").
