ML Engine BigQuery: Request had insufficient authentication scopes

Date: 2017-09-12 14:55:03

Tags: authentication google-bigquery google-cloud-ml-engine

I am submitting a TensorFlow model for training on ML Engine. I have built an input pipeline that reads from BigQuery using tf.contrib.cloud.python.ops.bigquery_reader_ops.BigQueryReader as the reader, fed from a queue.
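For reference, the pipeline follows the reader's documented usage; a minimal sketch is below. The project, dataset, table, and snapshot timestamp come from the error log further down, while the two feature columns shown and their dtypes are only an illustrative guess at the table schema:

    import tensorflow as tf
    from tensorflow.contrib.cloud.python.ops import bigquery_reader_ops

    # Illustrative subset of the table's columns; the dtypes are assumptions.
    features = {
        "LABEL": tf.FixedLenFeature([1], tf.int64),
        "ETA": tf.FixedLenFeature([1], tf.int64),
    }

    # Project/dataset/table and timestamp taken from the error log below.
    reader = bigquery_reader_ops.BigQueryReader(
        project_id="pasquinelli-bigdata",
        dataset_id="Transactions",
        table_id="t_11_Hotel_25_w_train",
        timestamp_millis=1505224768418,
        num_partitions=1,
        features=features)

    # Queue the table partitions, then read and parse one serialized Example.
    partition_queue = tf.train.string_input_producer(reader.partitions())
    row_id, example_serialized = reader.read(partition_queue)
    example = tf.parse_single_example(example_serialized, features=features)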

Locally in Datalab everything works fine, with the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the JSON key file for the credentials. However, when I submit the training job to the cloud, I get these errors (I am posting only the two main ones):

  1. Permission denied: Error executing an HTTP request when reading the schema for ... (HTTP response code 403, error code 0, error message '')

  2. Error creating the model. Check the details: Request had insufficient authentication scopes.

  3. I have already checked everything else, such as the table schema being correctly defined in the script and the project/dataset/table IDs and names.

    I am pasting the entire error from the logs here for clarity:

    Message: "Traceback (most recent call last):

    File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
    
    File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
    
    File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 131, in <module>
        hparams=hparam.HParams(**args.__dict__)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 210, in run
        return _execute_schedule(experiment, schedule)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 47, in _execute_schedule
        return task()
    
     File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 495, in train_and_evaluate
        self.train(delay_secs=0)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 275, in train
        hooks=self._train_monitors + extra_hooks)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 665, in _call_train
        monitors=hooks)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 289, in new_func
        return func(*args, **kwargs)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 455, in fit
        loss = self._train_model(input_fn=input_fn, hooks=hooks)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1007, in _train_model
        _, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 521, in __exit__
        self._close_internal(exception_type)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 556, in _close_internal
        self._sess.close()
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 791, in close
        self._sess.close()
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 888, in close
        ignore_live_threads=True)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
        six.reraise(*self._exc_info_to_raise)
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
        enqueue_callable()
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1063, in _single_operation_run
        target_list_as_strings, status, None)
    
    File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
        self.gen.next()
    
    File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    PermissionDeniedError: Error executing an HTTP request (HTTP response code 403, error code 0, error message '')
         when reading schema for pasquinelli-bigdata:Transactions.t_11_Hotel_25_w_train@1505224768418
         [[Node: GenerateBigQueryReaderPartitions = GenerateBigQueryReaderPartitions[columns=["F_RACC_GEST", "LABEL", "F_RCA", "W24", "ETA", "W22", "W23", "W20", "W21", "F_LEASING", "W2", "W16", "WLABEL", "SEX", "F_PIVA", "F_MUTUO", "Id_client", "F_ASS_VITA", "F_ASS_DANNI", "W19", "W18", "W17", "PROV", "W15", "W14", "W13", "W12", "W11", "W10", "W7", "W6", "W5", "W4", "W3", "F_FIN", "W1", "ImpTot", "F_MULTIB", "W9", "W8"], dataset_id="Transactions", num_partitions=1, project_id="pasquinelli-bigdata", table_id="t_11_Hotel_25_w_train", test_end_point="", timestamp_millis=1505224768418, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
    

    Any suggestion would be very helpful, as I am relatively new to GCP. Thank you all.

1 Answer:

Answer 0 (score: 0)

Support for reading BigQuery data from Cloud ML Engine is still under development, so what you are doing is not currently supported. The problem you are hitting is that the machines ML Engine runs on do not have the right scopes to talk to BigQuery. A related issue you would likely see even running locally is poor read performance from BigQuery. These are two examples of the work that still needs to be done.
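For context, the 403 "insufficient authentication scopes" means the credential the workers run under was not minted with the BigQuery OAuth scope (https://www.googleapis.com/auth/bigquery). Purely as an illustration of what an explicitly scoped credential looks like with the google-auth library (the key file path is hypothetical, and ML Engine workers' scopes cannot be widened this way):

    from google.oauth2 import service_account

    # Hypothetical key file; the scope URL is BigQuery's standard OAuth scope.
    credentials = service_account.Credentials.from_service_account_file(
        "/path/to/key.json",
        scopes=["https://www.googleapis.com/auth/bigquery"])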

In the meantime, I recommend exporting your data to GCS for training. This will be much more scalable, so you won't have to worry about degraded training performance as your data grows. It can also be a good pattern: preprocess your data once, write the result to GCS as CSV, and then run multiple training jobs to try different algorithms or hyperparameters.
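As a sketch of that export, assuming the google-cloud-bigquery client library and a hypothetical bucket name ("your-bucket"); the exact client API has varied across library versions:

    from google.cloud import bigquery

    client = bigquery.Client(project="pasquinelli-bigdata")
    table_ref = client.dataset("Transactions").table("t_11_Hotel_25_w_train")

    # Export the table to sharded CSV files in GCS.
    job_config = bigquery.ExtractJobConfig()
    job_config.destination_format = bigquery.DestinationFormat.CSV
    extract_job = client.extract_table(
        table_ref,
        "gs://your-bucket/training-data/part-*.csv",
        job_config=job_config)
    extract_job.result()  # blocks until the export job completes

The exported CSV shards can then be read in the training job with the standard file-based input plumbing (e.g. tf.TextLineReader plus tf.decode_csv), which ML Engine supports without any extra scopes.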