Error when submitting training job to gcloud

Time: 2018-04-20 00:48:05

Tags: tensorflow machine-learning gcloud google-cloud-ml

I am new to training on gcloud.

When I run the training job, I get the following error:

(gcloud.ml-engine.jobs.submit.training) Could not copy [research/dist/object_detection-0.1.tar.gz] to [training/packages/c5292b23e57f357dc2d63baab473c04337dbadd2deeb10965e743cd8422b964f/object_detection-0.1.tar.gz]. Please retry: HTTPError 404: Not Found

I am using this command to submit the training job:

gcloud ml-engine jobs submit training job1 \
--job-dir=gs://${ml-project-neu}/training \
--packages research/dist/object_detection-0.1.tar.gz,research/slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
--config cloud.yml \
--runtime-version=1.4 \
-- \
--train_dir=gs://${ml-project-neu}/training \
--pipeline_config_path=gs://${ml-project-neu}/data/faster_rcnn_inception_v2_pets.config

1 answer:

Answer 0 (score: 2)

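Make sure ${ml-project-neu} is valid (in your case it is probably an empty string), and make sure gs://${ml-project-neu} exists. Also make sure the credentials you are using with gcloud have access to your GCS bucket (consider running gcloud auth login).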
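
To check what the shell actually substitutes for ${ml-project-neu}, a small sketch like the one below may help. It relies on standard POSIX parameter expansion: hyphens are not allowed in variable names, so ${ml-project-neu} is read as "the value of ml, or the literal text project-neu if ml is unset". The variable name ML_PROJECT_NEU and the bucket name my-bucket are placeholders chosen for illustration and do not come from the question.

```shell
#!/bin/sh
# Case 1: "ml" is unset, so the default word after the hyphen is used.
unset ml
expanded_unset="gs://${ml-project-neu}/training"
echo "ml unset:  $expanded_unset"

# Case 2: "ml" is set but empty. Without a colon, the default is NOT used,
# so the bucket part of the URL ends up empty.
ml=""
expanded_empty="gs://${ml-project-neu}/training"
echo "ml empty:  $expanded_empty"

# A valid variable name uses underscores instead of hyphens.
# "my-bucket" is a placeholder; replace it with your real bucket name.
ML_PROJECT_NEU="my-bucket"

# ${VAR:?message} aborts with the message if VAR is unset or empty,
# which catches the problem before anything is submitted.
job_dir="gs://${ML_PROJECT_NEU:?bucket name is not set}/training"
echo "job dir:   $job_dir"
```

If the printed job dir looks right, running gsutil ls -b gs://my-bucket (with your own bucket name) is one way to confirm that the bucket exists and that your current credentials can see it.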