How to load a file from a GCS path without an IOError?

Asked: 2019-06-21 09:07:00

Tags: python nlp google-cloud-storage

I ran into a problem while running the XLNet code on a Google Cloud TPU. When I use gs://{model_path}/... as the model path, I get an IOError, like this:

Traceback (most recent call last):
  File "run_classifier.py", line 903, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_classifier.py", line 722, in main
    sp.Load(FLAGS.spiece_model_file)
  File "/usr/local/lib/python2.7/dist-packages/sentencepiece.py", line 118, in Load
    return _sentencepiece.SentencePieceProcessor_Load(self, filename)
IOError: Not found: "gs://ykproject/pre-trained/xlnet_cased_L-24_H-1024_A-16/spiece.model": No such file or directory Error #2

The original code is:

sp = spm.SentencePieceProcessor()
sp.Load(FLAGS.spiece_model_file)

I tried to track down the cause, so I tried opening a GCS file directly from Python:

f = open("gs://ykproject/test.txt", "r")

The error persists:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: 'gs://ykproject/test.txt'

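1 answer:

Answer 0 (score: 1)

It looks like you are trying to access objects as if they were files on your machine's file system. To access objects in Cloud Storage from Python, you need to instantiate a client, get the specific bucket, and fetch the object's data: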

# Imports the Google Cloud client library
from google.cloud import storage

# Instantiates a client
storage_client = storage.Client()

# Instantiates the bucket
bucket = storage_client.get_bucket(bucket_name)

# Instantiates the object
blob = bucket.blob(source_blob_name)

# Optionally download the object into your file system
blob.download_to_filename(destination_file_name)

More information about downloading objects is available in the Cloud Storage documentation.
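As a minimal sketch of how the snippet above could be applied to the SentencePiece case from the question: split the gs:// path into a bucket name and an object name, download the object to a temporary directory, and pass the resulting local path to sp.Load. The helper names split_gcs_uri, fetch_to_local, and load_sentencepiece_from_gcs are hypothetical (they are not part of XLNet or of any library), and the sketch assumes the google-cloud-storage and sentencepiece packages are installed and that credentials are available in the environment.

```python
import os
import tempfile


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket_name, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError("not a gs:// URI: %r" % (uri,))
    bucket_name, _, blob_name = uri[len(prefix):].partition("/")
    if not bucket_name or not blob_name or blob_name.endswith("/"):
        raise ValueError("expected gs://<bucket>/<object>, got %r" % (uri,))
    return bucket_name, blob_name


def fetch_to_local(uri, local_dir=None):
    """Download one GCS object to a local file and return the local path."""
    # Imported lazily so split_gcs_uri works without the client library.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_uri(uri)
    if local_dir is None:
        local_dir = tempfile.mkdtemp()
    local_path = os.path.join(local_dir, os.path.basename(blob_name))

    # Same calls as in the answer: client -> bucket -> blob -> download.
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    bucket.blob(blob_name).download_to_filename(local_path)
    return local_path


def load_sentencepiece_from_gcs(uri):
    """Hypothetical wrapper: fetch the model file, then load it locally."""
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor()
    sp.Load(fetch_to_local(uri))  # a plain local path, not a gs:// URI
    return sp
```

Under these assumptions, the two lines from the question could become something like sp = load_sentencepiece_from_gcs(FLAGS.spiece_model_file). The model file is copied once to local disk, which is usually acceptable for a small file such as spiece.model.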

Also, make sure your Cloud TPU service account has access to Cloud Storage. If it does not, you can update the permissions with the gsutil CLI tool, like this (for read access):

gsutil acl ch -u [SERVICE_ACCOUNT]:READER gs://[BUCKET_NAME]

Or like this (for write access):

gsutil acl ch -u [SERVICE_ACCOUNT]:WRITER gs://[BUCKET_NAME]

More information on this is available in the Google Cloud documentation.
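Before changing ACLs, it may help to confirm whether the object is visible at all from the machine running the script. The following is only a sketch: can_read_object is a hypothetical helper name, and it checks the credentials the script itself runs under, which are not necessarily those of the Cloud TPU service account.

```python
def can_read_object(bucket_name, blob_name, client=None):
    """Return True if the object exists and is visible to `client`."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub.
        from google.cloud import storage
        client = storage.Client()
    # Client.bucket() only builds a local reference; Blob.exists() is the
    # call that actually asks Cloud Storage about the object.
    return client.bucket(bucket_name).blob(blob_name).exists()
```

For example, can_read_object("ykproject", "test.txt") returning False would suggest a wrong bucket or object name, while a permission problem would more likely surface as an exception raised by the client library.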