Airflow GoogleCloudStorageHook throws a 404 error even though the data exists

Time: 2018-11-01 17:24:01

Tags: airflow

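The data exists in the bucket, but the API call says the file does not exist. One important thing I noticed is that the GCS object path constructed by the hook appears to contain a double slash when it is resolved.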

    [2018-11-01 11:55:47,836] {discovery.py:267} INFO - URL being requested: GET https://www.googleapis.com/discovery/v1/apis/storage/v1/rest
    [2018-11-01 11:55:49,422] {discovery.py:866} INFO - URL being requested: GET https://www.googleapis.com/storage/v1/b/dev/o/%2Fdaily_exports%2F20181030%2Fdata.gz?alt=media
    Traceback (most recent call last):
      File "/home/user/airflow/dags/custom/es_operator.py", line 51, in execute
        hook.download(self.bucket,self.object,self.local_path)
      File "/home/user/airflow/venv/lib/python3.6/site-packages/airflow/contrib/hooks/gcs_hook.py", line 162, in download
        .get_media(bucket=bucket, object=object) \
      File "/home/user/airflow/venv/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
        return wrapped(*args, **kwargs)
      File "/home/user/airflow/venv/lib/python3.6/site-packages/googleapiclient/http.py", line 842, in execute
        raise HttpError(resp, content, uri=self.uri)
    googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/storage/v1/b/lotlinx-dev/o/%2Fdaily_exports%2F20181030%2Fdata.gz?alt=media returned "Not Found">

    During handling of the above exception, another exception occurred:

When I open the link that the API built, the response looks like this:

Anonymous caller does not have storage.objects.get access to dev//daily_exports/20181030/data.gz.

The double slash is generated by the hook, and I have no control over it. Thanks for your help.
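
One thing worth checking: the requested URL in the log contains `%2Fdaily_exports...`, and `%2F` is a URL-encoded `/`. That suggests the object name passed to `hook.download` may already start with a slash, which would produce `dev//daily_exports/...`. Below is a minimal sketch of that idea, not a confirmed fix. The helper names `normalize_gcs_object_name` and `media_url` are hypothetical, and `media_url` only imitates the URL shape seen in the log for illustration; it is not how the hook builds its request.

```python
from urllib.parse import quote


def normalize_gcs_object_name(name: str) -> str:
    """Drop leading slashes so the name is relative to the bucket root.

    Hypothetical helper: assumes the object was uploaded without a
    leading slash in its name.
    """
    return name.lstrip("/")


def media_url(bucket: str, object_name: str) -> str:
    """Imitate the URL shape from the log, for illustration only."""
    return "https://www.googleapis.com/storage/v1/b/{}/o/{}?alt=media".format(
        bucket, quote(object_name, safe="")
    )


raw = "/daily_exports/20181030/data.gz"   # object name with a leading slash
clean = normalize_gcs_object_name(raw)

print(media_url("dev", raw))    # encoded name starts with %2F, as in the log
print(media_url("dev", clean))  # encoded name starts with daily_exports
```

If that assumption holds, the call from the traceback could be tried as `hook.download(self.bucket, normalize_gcs_object_name(self.object), self.local_path)`.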

0 answers:

No answers yet