Pivoting multi-level nested fields in BigQuery

Time: 2019-11-03 20:27:32

Tags: google-bigquery

My BigQuery table schema:

[Image: table schema]

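Following up on this post: bigquery pivoting with nested field. I am trying to flatten this table. I want to unnest the timeseries.data field, i.e. the final number of rows should equal the total length of the timeseries.data arrays. I also want to add certain annotation.properties.key values as additional columns, with annotation.properties.value as their values. In this case that would be a "margin" column. However, the query below gives me the error "Unrecognized name: data", even though after the last FROM I already have unnest(timeseries.data) as data.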

flow_timestamp, channel_name, number_of_digits, timestamp, value, margin
2019-10-31 15:31:15.079674 UTC, channel_1, 4, 2018-02-28T02:00:00, 50, 0.01

Query:

SELECT 
  flow_timestamp, timeseries.channel_name, 

  ( SELECT MAX(IF(channel_properties.key = 'number_of_digits', channel_properties.value, NULL)) 
    FROM UNNEST(timeseries.channel_properties) AS channel_properties
  ),
  data.timestamp ,data.value

,(with subq as (select * from unnest(data.annotation))
select max(if (properties.key = 'margin', properties.value, null))
from (
select * from unnest(subq.properties)
) as properties
) as margin

FROM my_table
left join unnest(timeseries.data) as data

WHERE DATE(flow_timestamp) between "2019-10-28" and "2019-11-02" 
order by flow_timestamp

1 answer:

Answer 0 (score: 0)

Try the following:

#standardSQL
SELECT 
  flow_timestamp, 
  timeseries.channel_name, 
  ( SELECT MAX(IF(channel_properties.key = 'number_of_digits', channel_properties.value, NULL)) 
    FROM UNNEST(timeseries.channel_properties) AS channel_properties
  ) AS number_of_digits, 
  item.timestamp, 
  item.value, 
  ( SELECT MAX(IF(prop.key = 'margin', prop.value, NULL)) 
    FROM UNNEST(item.annotation) AS annot, UNNEST(annot.properties) prop
  ) AS margin  
FROM my_table 
LEFT JOIN UNNEST(timeseries.data) item
WHERE DATE(flow_timestamp) BETWEEN '2019-10-28' AND '2019-11-02' 
ORDER BY flow_timestamp
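
To try the query above without access to the original table, one option is to prepend a one-row stand-in for my_table. This is only a sketch: the schema is inferred from the field paths used in the queries, every column type is an assumption, and the sample values are copied from the desired output row in the question. Note that the margin subquery reads item.annotation directly in its FROM clause rather than through a WITH clause as in the question; the outer alias probably cannot be resolved from inside a WITH clause there, which would be consistent with the reported error.

```sql
#standardSQL
-- Hypothetical one-row stand-in for my_table. The schema is inferred from the
-- field paths used in the queries; all column types are assumptions.
WITH my_table AS (
  SELECT
    TIMESTAMP '2019-10-31 15:31:15.079674 UTC' AS flow_timestamp,
    STRUCT(
      'channel_1' AS channel_name,
      [STRUCT('number_of_digits' AS key, '4' AS value)] AS channel_properties,
      [STRUCT(
        '2018-02-28T02:00:00' AS timestamp,
        50 AS value,
        [STRUCT(
          [STRUCT('margin' AS key, '0.01' AS value)] AS properties
        )] AS annotation
      )] AS data
    ) AS timeseries
)
SELECT
  flow_timestamp,
  timeseries.channel_name,
  -- Pivot one channel property into a column.
  ( SELECT MAX(IF(channel_properties.key = 'number_of_digits', channel_properties.value, NULL))
    FROM UNNEST(timeseries.channel_properties) AS channel_properties
  ) AS number_of_digits,
  item.timestamp,
  item.value,
  -- Correlated comma joins: one row per property of each annotation of this item.
  ( SELECT MAX(IF(prop.key = 'margin', prop.value, NULL))
    FROM UNNEST(item.annotation) AS annot, UNNEST(annot.properties) AS prop
  ) AS margin
FROM my_table
-- One output row per element of timeseries.data.
LEFT JOIN UNNEST(timeseries.data) AS item
WHERE DATE(flow_timestamp) BETWEEN '2019-10-28' AND '2019-11-02'
ORDER BY flow_timestamp
```

With this sample row the query should return a single row containing the same values as the desired output in the question.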

Below is a slightly less verbose version of the same solution, but I usually prefer the version above because it is easier to maintain:

#standardSQL
SELECT 
  flow_timestamp, 
  timeseries.channel_name, 
  ( SELECT MAX(IF(key = 'number_of_digits', value, NULL)) 
    FROM UNNEST(timeseries.channel_properties) AS channel_properties
  ) AS number_of_digits, 
  timestamp, 
  value, 
  ( SELECT MAX(IF(key = 'margin', value, NULL)) 
    FROM UNNEST(annotation), UNNEST(properties) 
  ) AS margin  
FROM my_table 
LEFT JOIN UNNEST(timeseries.data)   
WHERE DATE(flow_timestamp) BETWEEN '2019-10-28' AND '2019-11-02' 
ORDER BY flow_timestamp
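
The shorter version relies on the fact that unnesting an array of structs exposes the struct fields as columns, so key and value can be referenced without an alias. A minimal sketch of that behaviour, using a made-up array literal rather than data from the original table:

```sql
#standardSQL
-- The array literal is hypothetical; it only illustrates implicit field access.
SELECT key, value
FROM UNNEST([
  STRUCT('margin' AS key, '0.01' AS value),
  STRUCT('unit' AS key, 'EUR' AS value)
])
```

When several nesting levels share field names such as value, unaliased references depend on which scope they resolve in. This may be one reason the aliased version is easier to maintain.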