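Trouble querying public BigQuery data from a local workstation

Time: 2019-03-01 10:24:21

Tags: python google-bigquery google-colaboratory kaggle

I am trying to query public data (the Ethereum dataset) through the BigQuery API from my Colab notebook.

This is what I have tried: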

from google.colab import auth
auth.authenticate_user()
from google.cloud import bigquery
eth_project_id = 'crypto_ethereum_classic'
client = bigquery.Client(project=eth_project_id)

and I received this error message:

WARNING:google.auth._default:No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable

I also tried the BigQueryHelper library and received a similar error message:

from bq_helper import BigQueryHelper
eth_dataset = BigQueryHelper(active_project="bigquery-public-data",dataset_name="crypto_ethereum_classic") 

Error:

WARNING:google.auth._default:No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-21-53ac8b2901e1> in <module>()
      1 from bq_helper import BigQueryHelper
----> 2 eth_dataset = BigQueryHelper(active_project="bigquery-public-data",dataset_name="crypto_ethereum_classic")

/content/src/bq-helper/bq_helper.py in __init__(self, active_project, dataset_name, max_wait_seconds)
     23         self.dataset_name = dataset_name
     24         self.max_wait_seconds = max_wait_seconds
---> 25         self.client = bigquery.Client()
     26         self.__dataset_ref = self.client.dataset(self.dataset_name, project=self.project_name)
     27         self.dataset = None

/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/client.py in __init__(self, project, credentials, _http, location, default_query_job_config)
    140     ):
    141         super(Client, self).__init__(
--> 142             project=project, credentials=credentials, _http=_http
    143         )
    144         self._connection = Connection(self)

/usr/local/lib/python3.6/dist-packages/google/cloud/client.py in __init__(self, project, credentials, _http)
    221 
    222     def __init__(self, project=None, credentials=None, _http=None):
--> 223         _ClientProjectMixin.__init__(self, project=project)
    224         Client.__init__(self, credentials=credentials, _http=_http)

/usr/local/lib/python3.6/dist-packages/google/cloud/client.py in __init__(self, project)
    176         if project is None:
    177             raise EnvironmentError(
--> 178                 "Project was not passed and could not be "
    179                 "determined from the environment."
    180             )

OSError: Project was not passed and could not be determined from the environment.

Just to reiterate, I am using Colab. I know how to query this data on Kaggle, but I need to run the query from my Colab notebook.

1 answer:

Answer 0: (score: 0)

In Colab, you need to authenticate first.

from google.colab import auth
auth.authenticate_user()

This authenticates your user account so that it can be used with a project.
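
The traceback in the question shows that the client fails because no project ID can be determined, so authentication alone may not be enough. Below is a minimal sketch of one way to supply a project, assuming you own a Google Cloud project that can run queries. `BILLING_PROJECT_ID`, `build_query` and `run_in_colab` are hypothetical names introduced here, and the `blocks` table name is an assumption about the dataset's contents. The public dataset is referenced by its fully qualified name in the SQL rather than being passed as the client's project.

```python
import os

# Hypothetical placeholder: replace with the ID of a Google Cloud project you own.
BILLING_PROJECT_ID = "your-own-project-id"

# Fully qualified reference (<project>.<dataset>), using the names from the question.
PUBLIC_DATASET = "bigquery-public-data.crypto_ethereum_classic"


def build_query(table, limit=5):
    """Return a small SELECT against a table in the public dataset."""
    return f"SELECT * FROM `{PUBLIC_DATASET}.{table}` LIMIT {int(limit)}"


def run_in_colab(table="blocks"):
    """Authenticate, then query the public dataset from our own project.

    The table name "blocks" is an assumption; adjust it to a table that exists.
    """
    # Imported here so the helpers above work without these packages installed.
    from google.colab import auth
    from google.cloud import bigquery

    auth.authenticate_user()
    # The warning in the question names this variable; setting it lets code that
    # calls bigquery.Client() with no arguments find a project as well.
    os.environ["GOOGLE_CLOUD_PROJECT"] = BILLING_PROJECT_ID
    client = bigquery.Client(project=BILLING_PROJECT_ID)
    return client.query(build_query(table)).to_dataframe()
```

In a Colab cell, calling `run_in_colab()` after setting `BILLING_PROJECT_ID` should return a pandas DataFrame. Because the traceback shows bq_helper calling `bigquery.Client()` without a project, setting the environment variable before constructing `BigQueryHelper` may resolve that error too, although this has not been verified here.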