UNION ALL of parameterized queries

Time: 2019-04-02 20:10:18

Tags: python sql google-bigquery

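I have a query that works well. The problem is that one part of the query is a string that has to be read from a file. Each string yields 6 rows of output. I need the union of the results for every string in the file, so that the final result is a table with 6 × (number of strings) rows. I can read the file with Python.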

I have tried using parameterized queries. Each one returns only the 6 rows for its own string.

Most of my Python code is based on the example in the BigQuery documentation:

from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT pet_id, age, name
    FROM `myproject.mydataset.mytable`
    WHERE name = @name
    AND species = @species;
"""
# Attempt: repeat the parameters, one (name, species) pair per string from the file.
# The species values are strings, so the parameter type is STRING rather than INT64.
query_params = [
    bigquery.ScalarQueryParameter('name', 'STRING', 'Max'),
    bigquery.ScalarQueryParameter('species', 'STRING', 'Dog'),
    bigquery.ScalarQueryParameter('name', 'STRING', 'Alfred'),
    bigquery.ScalarQueryParameter('species', 'STRING', 'Cat')
]
job_config = bigquery.QueryJobConfig()
job_config.query_parameters = query_params
query_job = client.query(
    query,
    # Location must match that of the dataset(s) referenced in the query.
    location='US',
    job_config=job_config)  # API request - starts the query

# Print the results, using the columns selected in the query
for row in query_job:
    print('{}: \t{}\t{}'.format(row.pet_id, row.age, row.name))

How can I get the UNION ALL of all these query results?

The output should look something like this:

pet_id | age | name
___________________
1      | 5   | Max
2      | 8   | Alfred

1 answer:

Answer 0: (score: 1)

See the example below, which uses public data (so you can run the query yourself):

#standardSQL
SELECT * 
FROM `bigquery-public-data.baseball.schedules`
WHERE (year, duration_minutes) IN UNNEST([(2016, 187), (2016, 165), (2016, 189)])

The key here is to supply an array of the values you want to filter the table on and let IN UNNEST(array_of_values) do the work. Ideally it would look like this:

query = """
    SELECT pet_id, age, name
    FROM `myproject.mydataset.mytable`
    WHERE (name, species) IN UNNEST(@filter_array);
"""
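Why a single IN filter can stand in for the UNION ALL: each per-string query keeps the rows that match one (name, species) pair, so concatenating those results gives the same rows as one pass that keeps every row whose pair is in the list, provided the pairs are distinct. A minimal plain-Python sketch of that equivalence follows. The `rows` data and the helper names are made up for illustration and are not part of BigQuery.

```python
# Hypothetical in-memory stand-in for the table; not real data.
rows = [
    {"pet_id": 1, "age": 5, "name": "Max", "species": "Dog"},
    {"pet_id": 2, "age": 8, "name": "Alfred", "species": "Cat"},
    {"pet_id": 3, "age": 2, "name": "Max", "species": "Cat"},
]
pairs = [("Max", "Dog"), ("Alfred", "Cat")]


def union_all_of_single_filters(rows, pairs):
    """One filter per pair, results concatenated (the UNION ALL approach)."""
    result = []
    for name, species in pairs:
        result.extend(
            r for r in rows if (r["name"], r["species"]) == (name, species)
        )
    return result


def single_in_filter(rows, pairs):
    """One pass keeping rows whose pair is in the list (the IN UNNEST approach)."""
    wanted = set(pairs)
    return [r for r in rows if (r["name"], r["species"]) in wanted]


def sort_key(row):
    return row["pet_id"]


# Compare the two approaches ignoring row order.
same = sorted(union_all_of_single_filters(rows, pairs), key=sort_key) == sorted(
    single_in_filter(rows, pairs), key=sort_key
)
print(same)
```

If the same pair appears twice in the list, the UNION ALL approach returns its rows twice while the IN filter returns them once, so deduplicate the list first if that difference matters.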

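Unfortunately, as far as I know, the BigQuery Python API does not let you pass an array of structs (the (name, species) pairs) as a query parameter. So you may have to do something like this: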

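from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT pet_id, age, name
    FROM `myproject.mydataset.mytable`
    WHERE concat(name, "_", species) IN UNNEST(@filter_array);
"""
# Each entry is a name and a species joined with "_", matching the concat() above.
array_of_pre_concatenated_name_and_species = ['Max_Dog', 'Alfred_Cat']
query_params = [
    bigquery.ArrayQueryParameter('filter_array', 'STRING', array_of_pre_concatenated_name_and_species),
]
job_config = bigquery.QueryJobConfig()
job_config.query_parameters = query_params
query_job = client.query(query, location='US', job_config=job_config)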
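
Since the strings come from a file, the pre-concatenated list can be built in Python before it is handed to the array parameter. Here is a minimal sketch, assuming a hypothetical file format with one `name,species` pair per line. The function name `build_filter_keys` and the separator check are illustrative and not part of the BigQuery client.

```python
import csv
import io


def build_filter_keys(lines, sep="_"):
    """Turn 'name,species' lines into 'name_species' keys for the array parameter.

    Raises ValueError if a name already contains the separator, because
    then two different pairs could produce the same key.
    """
    keys = []
    for record in csv.reader(lines):
        if not record:
            continue  # skip blank lines
        name, species = (field.strip() for field in record)
        if sep in name:
            raise ValueError(
                "separator {!r} occurs in name {!r}".format(sep, name)
            )
        keys.append(name + sep + species)
    return keys


# io.StringIO stands in for an open file here; with a real file, pass the
# object returned by open(path, newline="") instead.
sample = io.StringIO("Max,Dog\nAlfred,Cat\n")
filter_keys = build_filter_keys(sample)
print(filter_keys)
```

The resulting list can then be used as the values of the `filter_array` parameter. If names may legitimately contain `_`, pick a separator that cannot occur in the data and use the same one in the SQL `concat`.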