Unable to read a BigQuery table sourced from Google Sheets (OAuth / scope error)

Time: 2019-07-09 03:33:20

Tags: google-sheets google-bigquery

import pandas as pd
from google.cloud import bigquery
import google.auth

# Create credentials with Drive & BigQuery API scopes
# Both APIs must be enabled for your project before running this code
credentials, project = google.auth.default(scopes=[
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/bigquery',
])
client = bigquery.Client(credentials=credentials, project=project)

# Configure the external data source and query job
external_config = bigquery.ExternalConfig('GOOGLE_SHEETS')
# Use a shareable link or grant viewing access to the email address you
# used to authenticate with BigQuery (this example Sheet is public)
sheet_url = (
    'https://docs.google.com/spreadsheets'
    '/d/1uknEkew2C3nh1JQgrNKjj3Lc45hvYI2EjVCcFRligl4/edit?usp=sharing')
external_config.source_uris = [sheet_url]
external_config.schema = [
    bigquery.SchemaField('name', 'STRING'),
    bigquery.SchemaField('post_abbr', 'STRING')
]
external_config.options.skip_leading_rows = 1  # optionally skip header row
table_id = 'BambooHRActiveRoster'
job_config = bigquery.QueryJobConfig()
job_config.table_definitions = {table_id: external_config}

# Fetch the first 10 rows
sql = 'SELECT * FROM workforce.BambooHRActiveRoster LIMIT 10'

query_job = client.query(sql, job_config=job_config)  # API request

top10 = list(query_job)  # Waits for query to finish
print('Fetched {} rows.'.format(len(top10)))

The error I get is:

BadRequest: 400 Error while reading table: workforce.BambooHRActiveRoster, error message: Failed to read the spreadsheet. Errors: No OAuth token with Google Drive scope was found.

I can pull data from BigQuery tables that were created by CSV upload, but whenever the BigQuery table was created from a linked Google Sheet, I keep getting this error.

I tried to replicate the example in the Google documentation (creating and querying a temporary table):

https://cloud.google.com/bigquery/external-data-drive

4 Answers:

Answer 0: (score: 1)

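You are authenticating as yourself, which is generally fine for BigQuery as long as you have the right permissions. Tables linked to Google Sheets, however, often require a service account. Create one (or have your BI/IT team create one), then share the underlying Google Sheet with the service account. Finally, modify your Python script to use the service account credentials instead of your own.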
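Sharing the Sheet requires the service account's email address. As a minimal sketch, assuming a standard JSON key file downloaded for the service account (such files store the address in a `client_email` field), the address can be read as follows. The helper name `service_account_email` is illustrative and not part of the original answer:

```python
import json


def service_account_email(key_path):
    """Return the email address stored in a service account JSON key file."""
    with open(key_path, encoding="utf-8") as fh:
        key = json.load(fh)
    # Standard service account key files carry the address in "client_email".
    return key["client_email"]
```

Calling `service_account_email('mykey.json')` (a placeholder path) would give the address to add under the Sheet's sharing settings with at least view access.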

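A quick way around this is to run a select * against the Sheets-linked table in the BigQuery interface, save the results to a new table, and then query that new table directly from your Python script. This works well for a one-time upload or analysis. If the data in the Sheet keeps changing and you need to query it routinely, it is not a long-term solution.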
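The snapshot step can also be written as a single statement to paste into the BigQuery console. This is a minimal sketch: the destination table `workforce.BambooHRActiveRoster_snapshot` is a hypothetical name chosen for illustration (only the source table comes from the question), and the helper merely builds the SQL text.

```python
def snapshot_sql(source_table, destination_table):
    """Build a statement that copies a Sheets-linked table into a native table."""
    return (
        "CREATE OR REPLACE TABLE `{dest}` AS\n"
        "SELECT * FROM `{src}`"
    ).format(dest=destination_table, src=source_table)


# The destination name below is a placeholder, not a table from the question.
sql = snapshot_sql(
    "workforce.BambooHRActiveRoster",
    "workforce.BambooHRActiveRoster_snapshot",
)
print(sql)
```

Once the snapshot exists, the script can query it with ordinary BigQuery credentials because no Drive access is involved. The snapshot has to be re-created whenever the Sheet changes.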

Answer 1: (score: 0)

import pandas as pd  # required by to_dataframe()
from google.oauth2 import service_account
from google.cloud import bigquery

# The Drive scope is needed to read tables backed by Google Sheets
SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/bigquery',
]
SERVICE_ACCOUNT_FILE = 'mykey.json'

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)

# with_subject() is typically only needed with domain-wide delegation;
# otherwise `credentials` can be passed to the client directly
delegated_credentials = credentials.with_subject('myserviceaccountt@domain.iam.gserviceaccount.com')

# Use the project ID stored in the key file
client = bigquery.Client(
    credentials=delegated_credentials, project=credentials.project_id)

sql = 'SELECT * FROM `myModel`'

DF = client.query(sql).to_dataframe()

Answer 2: (score: 0)

You can try updating your application default credentials from the command line:

gcloud auth application-default login --scopes=https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/cloud-platform
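The scope list has to be passed as one comma-separated argument with no spaces, which is easy to mistype. As a small sketch, the same command can be assembled from a Python list; the helper name `adc_login_command` is illustrative and the scopes are the ones shown above:

```python
SCOPES = [
    "https://www.googleapis.com/auth/userinfo.email",
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/cloud-platform",
]


def adc_login_command(scopes):
    """Return the gcloud command that refreshes Application Default Credentials."""
    return "gcloud auth application-default login --scopes=" + ",".join(scopes)


print(adc_login_command(SCOPES))
```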

Answer 3: (score: 0)

I solved the problem by adding scopes to the credentials passed to the client.

from google.cloud import bigquery
import google.auth

credentials, project = google.auth.default(scopes=[
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/bigquery',
])
CLIENT = bigquery.Client(project=project, credentials=credentials)

https://cloud.google.com/bigquery/external-data-drive