SQL and BigQuery field selection

Time: 2019-06-25 15:00:28

Tags: google-bigquery

The following shortened code requests a table and prints the first four entries of its schema:

# pip install google-cloud-bigquery
from google.cloud import bigquery
client = bigquery.Client()
dataset_ref = client.dataset("chicago_crime", project="bigquery-public-data")
dataset = client.get_dataset(dataset_ref)
table_ref = dataset_ref.table("crime")
table = client.get_table(table_ref)
table.schema[:4]

Output:

[SchemaField('unique_key', 'INTEGER', 'REQUIRED', 'Unique identifier for the record.', ()),
 SchemaField('case_number', 'STRING', 'NULLABLE', 'The Chicago Police Department RD Number (Records Division Number), which is unique to the incident.', ()),
 SchemaField('date', 'TIMESTAMP', 'NULLABLE', 'Date when the incident occurred. this is sometimes a best estimate.', ()),
 SchemaField('block', 'STRING', 'NULLABLE', 'The partially redacted address where the incident occurred, placing it on the same block as the actual address.', ())]

The following code lists the fields at positions 1 and 3 of the schema:

from operator import itemgetter
fields_list=itemgetter(1,3)(table.schema)
client.list_rows(table, selected_fields=fields_list, max_results=5).to_dataframe()

Output:

    case_number block
0   JC299491    114XX S CHAMPLAIN AVE
1   JC273204    053XX N LOWELL AVE
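
For readers unfamiliar with `operator.itemgetter`, here is a minimal sketch of the positional selection used above. It assumes a plain list of strings (`schema_names`, a made-up name for illustration) in place of the real `table.schema`, so it runs without a BigQuery client:

```python
from operator import itemgetter

# Plain strings stand in for the SchemaField objects in table.schema
# (an assumption for illustration only).
schema_names = ["unique_key", "case_number", "date", "block"]

# itemgetter(1, 3) builds a callable that returns a tuple holding
# the items at index 1 and index 3 of whatever sequence it is given.
picked = itemgetter(1, 3)(schema_names)
print(picked)  # ('case_number', 'block')
```

This works, but it depends on knowing the position of each column, which is what the question below tries to avoid.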

How can I specify the fields explicitly by name instead?

fields_list=['case_number', 'block']

1 answer:

Answer 0 (score: 2)

You can build a lookup that maps each schema field name to its SchemaField object.

Something like:

schema_fields = dict((s.name.lower(), s) for s in table.schema)

With this in place, you can pick fields by name:

fields_list = ['case_number', 'block']
# Build a list rather than a lazy map object, so the fields can be iterated more than once in Python 3
selected_fields = [schema_fields[n] for n in fields_list]
client.list_rows(table, selected_fields=selected_fields, max_results=5).to_dataframe()
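
The lookup above could be wrapped in a small helper that also reports misspelled column names. The sketch below is only an illustration: `Field` is a hypothetical stand-in for `bigquery.SchemaField` (only its `name` attribute is used), and `select_fields_by_name` is a name made up here, not part of the BigQuery client library.

```python
from collections import namedtuple

# Hypothetical stand-in for bigquery.SchemaField; only .name matters here.
Field = namedtuple("Field", ["name", "field_type"])


def select_fields_by_name(schema, names):
    """Return the schema entries matching `names`, in the order requested."""
    # Map lower-cased field names to their field objects.
    by_name = {field.name.lower(): field for field in schema}
    # Fail early with a readable message if a requested name is unknown.
    missing = [n for n in names if n.lower() not in by_name]
    if missing:
        raise KeyError(f"Unknown field name(s): {missing}")
    return [by_name[n.lower()] for n in names]


# Stand-in schema mirroring the first four columns shown in the question.
schema = [
    Field("unique_key", "INTEGER"),
    Field("case_number", "STRING"),
    Field("date", "TIMESTAMP"),
    Field("block", "STRING"),
]

selected = select_fields_by_name(schema, ["case_number", "block"])
print([f.name for f in selected])  # ['case_number', 'block']
```

With a real table, the same helper should accept `table.schema` directly, and its result can be passed as `selected_fields` to `client.list_rows` as in the answer above.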