How do I create a separate table for each category in an existing table column in BigQuery?

Time: 2019-01-28 08:45:34

Tags: google-bigquery

My existing table has a category column, and I need to create a separate table for each unique category. How can I do this in BigQuery?

1 answer:

Answer 0: (score: 0)

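You can use the BigQuery Python client library (google-cloud-bigquery). The following example creates one empty table, with a fixed schema, for each distinct value of the column: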

from google.cloud import bigquery

# Initialize the client
client = bigquery.Client()

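# Build a reference to the dataset that will hold the new tables
# (DatasetReference is used here because client.dataset() is deprecated in newer releases)
dataset_ref = bigquery.DatasetReference(client.project, '{dataset}')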

# Build the query that returns the distinct category values
QUERY = (
    'SELECT DISTINCT {column} FROM `{project-ID}.{dataset}.{table}`'
)

# API request
query_job = client.query(QUERY)

# Wait for the query to finish and fetch the result rows
rows = query_job.result()

# Define the schema of the new table
schema = [
    bigquery.SchemaField('ExampleField', 'STRING', mode='REQUIRED')
]

# Each result row supports index access; index 0 holds the category value
for value in rows:
    # Name the new table after the category value and create it
    table_ref = dataset_ref.table(value[0])
    table = bigquery.Table(table_ref, schema=schema)
    table = client.create_table(table)
    assert table.table_id == value[0]
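
The loop above only creates empty tables. If the goal is for each table to also hold the rows of its category, one possible approach is to run a filtered query per category and write the result to a destination table. The following is a minimal sketch, not a tested solution: the helper names `build_select` and `split_by_category` are hypothetical, the category column is assumed to be of type STRING, and every category value is assumed to already be a valid table name.

```python
def build_select(project, dataset, source_table, column):
    """Return the query that selects the rows of one category.

    The category value is supplied separately as the @category query
    parameter, so it is never spliced into the SQL text.
    """
    return (
        f"SELECT * FROM `{project}.{dataset}.{source_table}` "
        f"WHERE {column} = @category"
    )


def split_by_category(client, project, dataset, source_table, column):
    """Write the rows of each distinct category to a table named after it."""
    # Imported here so that build_select has no external dependency.
    from google.cloud import bigquery

    # Rows with a NULL category are skipped (assumption: they are not needed).
    distinct = client.query(
        f"SELECT DISTINCT {column} FROM `{project}.{dataset}.{source_table}` "
        f"WHERE {column} IS NOT NULL"
    ).result()

    for row in distinct:
        value = row[0]
        config = bigquery.QueryJobConfig(
            # Assumption: the category value is usable as a table name as-is.
            destination=f"{project}.{dataset}.{value}",
            query_parameters=[
                bigquery.ScalarQueryParameter("category", "STRING", value)
            ],
        )
        sql = build_select(project, dataset, source_table, column)
        # Wait for each job so that failures surface immediately.
        client.query(sql, job_config=config).result()
```

Calling `split_by_category(bigquery.Client(), 'my-project', 'my_dataset', 'my_table', 'category')` (all four names are placeholders) would run one query job per category, so this is probably only practical for a modest number of categories.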

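Keep in mind that table names may contain only letters (upper or lower case), digits, and underscores. If a category value is not a valid table name, you may get an exception.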
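
One way to avoid that exception is to normalize each category value before using it as a table name. The sketch below follows the naming rule as stated in this answer (letters, digits, and underscores only). The helper name `to_table_id` is hypothetical and not part of the BigQuery client library.

```python
import re


def to_table_id(value):
    """Map a category value to a name made only of letters, digits, and underscores."""
    # Replace each run of disallowed characters with a single underscore.
    table_id = re.sub(r"[^A-Za-z0-9_]+", "_", str(value))
    # Reject values that leave nothing recognizable behind.
    if not re.search(r"[A-Za-z0-9]", table_id):
        raise ValueError(f"cannot derive a table name from {value!r}")
    return table_id
```

For example, `to_table_id("Home & Garden")` gives `Home_Garden`. Note that different categories can map to the same name (both "a-b" and "a b" become `a_b`), so it is worth checking for duplicates before creating the tables.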