Why are there insufficient accelerators when running a gcloud ml-engine job?

Time: 2017-08-19 20:45:31

Tags: google-cloud-platform google-cloud-ml-engine

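I am trying to run a machine learning job on Google Cloud, but it keeps telling me that not enough accelerators are available. I tried the parameter --scale-tier=BASIC | BASIC_GPU | STANDARD_1 | PREMIUM_1, and got the same result every time.

Here are the command and the result: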

gcloud ml-engine jobs submit training object_detection_`date +%s` \
    --job-dir=gs://${TRAIN_DIR} \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
    --module-name object_detection.train \
    --region us-central1 \
    --config ${PATH_TO_LOCAL_YAML_FILE} \
    -- \
    --train_dir=gs://${TRAIN_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
ERROR: (gcloud.ml-engine.jobs.submit.training) RESOURCE_EXHAUSTED: Field: scale_tier Error: Insufficient accelerators are available in region us-central1 to schedule the job which requests 6 K80 accelerators. Please wait and try again or else try submitting your job to a different region.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: Insufficient accelerators are available in region us-central1 to
      schedule the job which requests 6 K80 accelerators. Please wait and try again
      or else try submitting your job to a different region.
    field: scale_tier

1 Answer:

Answer 0: (score: 8)

GPUs are in high demand in us-central1. I recommend starting jobs in us-east1 whenever possible until more GPUs become available.
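
The advice above (and the error message's own suggestion to try a different region) could be scripted roughly as follows. This is only a sketch under stated assumptions: the `submit_in_first_available_region` helper, the `GCLOUD` override variable, and the `REGIONS` list are hypothetical names introduced here for illustration, and the list contains only the two regions mentioned in this thread. The gcloud flags are the same ones used in the question.

```shell
#!/usr/bin/env bash
# Sketch: submit the same training job to each region in turn until one accepts it.
# Hypothetical names (not from the original post): GCLOUD, REGIONS,
# submit_in_first_available_region.

GCLOUD="${GCLOUD:-gcloud}"                   # override with a stub for a dry run
REGIONS="${REGIONS:-us-east1 us-central1}"   # preferred region first

submit_in_first_available_region() {
  local region job_name
  for region in $REGIONS; do
    # Use a fresh job name per attempt; hyphens are replaced with underscores.
    job_name="object_detection_$(date +%s)_${region//-/_}"
    echo "Trying region: ${region}" >&2
    if "$GCLOUD" ml-engine jobs submit training "${job_name}" \
        --job-dir="gs://${TRAIN_DIR}" \
        --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
        --module-name object_detection.train \
        --region "${region}" \
        --config "${PATH_TO_LOCAL_YAML_FILE}" \
        -- \
        --train_dir="gs://${TRAIN_DIR}" \
        --pipeline_config_path="gs://${PIPELINE_CONFIG_PATH}" >&2; then
      echo "${region}"   # report the region that accepted the job
      return 0
    fi
  done
  echo "No region in '${REGIONS}' accepted the job" >&2
  return 1
}

# Usage (after exporting TRAIN_DIR, PATH_TO_LOCAL_YAML_FILE, PIPELINE_CONFIG_PATH):
#   submit_in_first_available_region
```

The helper treats any non-zero exit from the submit command as a reason to move on to the next region, so an unrelated failure (for example, a bad package path) would also trigger a retry elsewhere; checking the error text for RESOURCE_EXHAUSTED before retrying would be a stricter variant.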