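Extract table metadata fields together with data columns from BigQuery tables

Time: 2019-09-20 10:19:22

Tags: google-bigquery

I have to query BigQuery tables and extract the following fields:

a) table metadata columns (2 columns)

b) data columns (3 columns)

For example: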

table_id,     creation_time, col_id1, col_id2, col_id3
tbl_20180424, 1524641477022, 1,       2,       3
tbl_20180524, 1524647897022, 11,      12,      13

I tried the query below, but it does not work:

SELECT
table_id,
creation_time,
year,
Week,
id1,
id2,
id3
FROM (
SELECT  
table_id,
TIMESTAMP_MILLIS(creation_time) AS creation_time,
EXTRACT(YEAR FROM CAST(TIMESTAMP_MILLIS(creation_time) as DATE)) Year,
EXTRACT(WEEK FROM CAST(TIMESTAMP_MILLIS(creation_time) as DATE)) Week
FROM `project.dataset.__TABLES_SUMMARY__`
where table_id like 'tbl_%' )

2 answers:

Answer 0: (score: 2)

The following is for BigQuery Standard SQL

#standardSQL
WITH data AS (
  SELECT CONCAT('tbl_', _TABLE_SUFFIX) AS table_id, col_id1, col_id2, col_id3
  FROM `project.dataset.tbl_*`
), metadata AS (
  SELECT 
    table_id,
    TIMESTAMP_MILLIS(creation_time) AS creation_time,
    EXTRACT(YEAR FROM CAST(TIMESTAMP_MILLIS(creation_time) AS DATE)) Year,
    EXTRACT(WEEK FROM CAST(TIMESTAMP_MILLIS(creation_time) AS DATE)) Week
  FROM `project.dataset.__TABLES_SUMMARY__`
  WHERE table_id LIKE 'tbl_%'
)
SELECT *
FROM metadata
JOIN data
USING(table_id)  

Applied to the simplified example from your question, the result will be

Row table_id        creation_time               Year    Week    col_id1 col_id2 col_id3  
1   tbl_20180424    2019-09-20 19:21:38.600 UTC 2019    37      1       2       3    
2   tbl_20180524    2019-09-20 19:22:00.676 UTC 2019    37      11      12      13   
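
To experiment with the join without touching real tables, here is a minimal sketch in which inline literals stand in for the wildcard table and for __TABLES_SUMMARY__. The literal rows are copied from the example in the question; the inline CTEs are an assumption made purely for illustration and do not reflect how your dataset is actually stored.

```sql
#standardSQL
-- Sketch only: both CTEs are hand-written stand-ins, not real tables.
WITH data AS (
  -- stands in for: SELECT CONCAT('tbl_', _TABLE_SUFFIX) ... FROM `project.dataset.tbl_*`
  SELECT 'tbl_20180424' AS table_id, 1 AS col_id1, 2 AS col_id2, 3 AS col_id3 UNION ALL
  SELECT 'tbl_20180524', 11, 12, 13
), metadata AS (
  -- stands in for: `project.dataset.__TABLES_SUMMARY__`
  SELECT 'tbl_20180424' AS table_id, 1524641477022 AS creation_time UNION ALL
  SELECT 'tbl_20180524', 1524647897022
)
SELECT
  table_id,
  TIMESTAMP_MILLIS(creation_time) AS creation_time,
  EXTRACT(YEAR FROM TIMESTAMP_MILLIS(creation_time)) AS Year,
  EXTRACT(WEEK FROM TIMESTAMP_MILLIS(creation_time)) AS Week,
  col_id1,
  col_id2,
  col_id3
FROM metadata
JOIN data
USING (table_id)
```

The key point is that the metadata and the data come from two different sources, so they have to be joined on table_id; a single SELECT over __TABLES_SUMMARY__ cannot see the data columns.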

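Answer 1: (score: 0)

In your query you are trying to select id1, id2, and id3, but they are not present in your subquery. Also, consider looking into INFORMATION_SCHEMA, as follows: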

with tables as (select table_name, creation_time from dataset.INFORMATION_SCHEMA.TABLES),
    columns as (select table_name, column_name, ordinal_position from dataset.INFORMATION_SCHEMA.COLUMNS)
select
  table_name,
  creation_time,
  EXTRACT(YEAR from creation_time) as Year,
  EXTRACT(WEEK from creation_time) as Week,
  column_name,
  ordinal_position
from tables
inner join columns using(table_name)
order by table_name, ordinal_position

This query returns an unpivoted result set covering all tables and columns in the dataset.
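
Note that INFORMATION_SCHEMA.COLUMNS describes the columns themselves (name, position, and so on), not the values stored in them, so this approach covers the metadata side only. If only the tbl_ tables are of interest, the query can be narrowed. A sketch, assuming the same "dataset" placeholder as above:

```sql
-- Sketch: same join as above, restricted to tables whose name starts with 'tbl_'.
-- "dataset" is a placeholder for your own dataset name.
SELECT
  table_name,
  creation_time,
  EXTRACT(YEAR FROM creation_time) AS Year,
  EXTRACT(WEEK FROM creation_time) AS Week,
  column_name,
  ordinal_position
FROM dataset.INFORMATION_SCHEMA.TABLES
INNER JOIN dataset.INFORMATION_SCHEMA.COLUMNS USING (table_name)
WHERE STARTS_WITH(table_name, 'tbl_')
ORDER BY table_name, ordinal_position
```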