Using Google BigQuery from a Python script

Time: 2015-11-03 17:04:59

Tags: python google-bigquery google-cloud-storage

I want to do some very simple tasks on BigQuery from a Python script. I found this package, but it does not work well for me. When I try this code:

from bigquery import get_client


project_id = 'txxxxxxxxxxxxxxxxxx9'
# Service account email address as listed in the Google Developers Console.
service_account = '7xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com'
# PKCS12 or PEM key provided by Google.
key = '/home/fxxxxxxxxxxxx/Dropbox/access_keys/google_storage/xxxxxxxxxxxxxxxxxxxxx.pem'
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
# Fetch the schema of an existing table.
results = client.get_table_schema('newdataset', 'newtable2')

print(results)

I get this error:

/home/xxxxxx/anaconda3/envs/snakes/bin/python2.7 /home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py
Traceback (most recent call last):
  File "/home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py", line 9, in <module>
    client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 83, in get_client
    readonly=readonly)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 101, in _get_bq_service
    service = build('bigquery', 'v2', http=http)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/util.py", line 142, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 196, in build
    cache)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 242, in _retrieve_discovery_doc
    resp, content = http.request(actual_url)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 565, in new_request
    self._refresh(request_orig)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 835, in _refresh
    self._do_refresh_request(http_request)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 862, in _do_refresh_request
    body = self._generate_refresh_request_body()
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1541, in _generate_refresh_request_body
    assertion = self._generate_assertion()
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1670, in _generate_assertion
    private_key, self.private_key_password), payload)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 121, in from_string
    pkey = RSA.importKey(parsed_pem_key)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 665, in importKey
    return self._importKeyDER(der)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 588, in _importKeyDER
    raise ValueError("RSA key format is not supported")
ValueError: RSA key format is not supported

Process finished with exit code 1

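My question: is there a Python tutorial that shows how to interact with BigQuery easily, covering importing a dataset from Google Storage or S3, querying the content, and exporting the results to Google Storage?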

1 answer:

Answer 0: (score: 3)

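A lot depends on your environment; once you have that figured out, everything should be super simple. The only problem I see in the error log you pasted is getting authentication sorted out.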
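Since the traceback ends with PyCrypto's RSA.importKey rejecting the key, one thing that may be worth checking is which container format the key file actually uses. The following is only a diagnostic sketch, not a confirmed explanation of the error: describe_key_file is a hypothetical helper (it is not part of any Google library), and it does nothing more than look at the first PEM header or the leading bytes of the file.

```python
# Hypothetical diagnostic helper: reports the container format of a key file
# by inspecting its first PEM header line or its leading bytes.
PEM_LABELS = {
    b"-----BEGIN RSA PRIVATE KEY-----": "PKCS#1 RSA private key",
    b"-----BEGIN PRIVATE KEY-----": "PKCS#8 private key (unencrypted)",
    b"-----BEGIN ENCRYPTED PRIVATE KEY-----": "PKCS#8 private key (encrypted)",
    b"-----BEGIN CERTIFICATE-----": "X.509 certificate, not a private key",
}


def describe_key_file(data):
    """Return a short description of the format of the given key bytes."""
    for line in data.splitlines():
        line = line.strip()
        # Text before the first BEGIN line (for example "Bag Attributes"
        # written by some export tools) is skipped.
        if line.startswith(b"-----BEGIN "):
            return PEM_LABELS.get(
                line, "unknown PEM block: " + line.decode("ascii", "replace")
            )
    # An ASN.1 SEQUENCE with a two-byte length is typical of binary DER
    # and PKCS#12 files.
    if data[:2] == b"\x30\x82":
        return "binary DER or PKCS#12 data, not PEM"
    return "unrecognized format"
```

To use it, read the file in binary mode, for example `describe_key_file(open(key, 'rb').read())` with the `key` path from the question. If the result is not one of the plain private-key formats, the key probably needs to be converted or re-downloaded before the client library can load it.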

Python pandas has supported BigQuery for some time.
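As a rough sketch of what the pandas route can look like: the example below assumes the separately installed pandas-gbq package (where the pandas BigQuery reader now lives), credentials that are already configured for the project, and standard SQL. The helper names preview_query and preview_table are hypothetical; the dataset and table names are the ones from the question.

```python
def preview_query(dataset, table, limit=10):
    """Build a standard-SQL query that returns the first rows of a table."""
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return "SELECT * FROM `{}.{}` LIMIT {}".format(dataset, table, limit)


def preview_table(project_id, dataset, table, limit=10):
    """Run the preview query and return the rows as a pandas DataFrame."""
    # Imported lazily so that building the query does not require pandas-gbq.
    import pandas_gbq

    return pandas_gbq.read_gbq(
        preview_query(dataset, table, limit), project_id=project_id
    )
```

Calling `preview_table(project_id, 'newdataset', 'newtable2')` should then return a DataFrame, provided the credentials in use have access to that project.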

I recorded a video together with the creator of the module.

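Right now, the easiest and fastest way is to launch a Jupyter notebook that comes with all the Google Cloud goodies you mentioned, which is our newly launched Google Datalab project.

The only caveat with Datalab is that it runs on cloud servers. But if you want a fully managed Jupyter/IPython environment that is secure, persistent, and ready to work with BigQuery, Storage, and so on, give it a try.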

Meanwhile, if you are writing a web app, take a look at how other web apps solve this task.

For example, the re:dash code for connecting to BigQuery:

https://github.com/EverythingMe/redash/blob/master/redash/query_runner/big_query.py