BigQuery: Filtering repeated fields with legacy SQL

Time: 2016-12-12 10:00:32

Tags: google-bigquery

I have the following table:

row | query_params | query_values
1     foo            bar  
      param          val
2     foo            baz 

JSON:

{ 
"query_params" : [ "foo", "param"], 
"query_values" : [ "bar", "val" ] 
}, { 
"query_params" : [ "foo" ], 
"query_values" : [ "baz" ] 
}

Using legacy SQL, I would like to filter the repeated fields by value, for example:

SELECT * FROM table WHERE query_params = 'foo'

which would output:

row | query_params | query_values
1     foo            bar  
2     foo            baz       

PS: This question is related to the same problem, answered using standard SQL, here

2 answers:

Answer 0: (score: 0)

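For legacy SQL, I can't think of anything better than flattening each array separately and then using a JOIN. If your table T has the contents above, you can do the following: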

SELECT
  [t1.row],
  t1.query_params,
  t2.query_values
FROM
  FLATTEN((SELECT [row], query_params, POSITION(query_params) AS pos
           FROM T WHERE query_params = 'foo'), query_params) AS t1
JOIN
  FLATTEN((SELECT [row], query_values, POSITION(query_values) AS pos
           FROM T), query_values) AS t2
ON [t1.row] = [t2.row] AND
  t1.pos = t2.pos;

The idea is to correlate the elements of the two arrays by row and position after filtering on query_params.
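To make the row-and-position correlation concrete, here is a minimal Python sketch that models the same logic outside BigQuery. It is only an illustration, not how BigQuery executes the query: the list `T` and the helper names `flatten_with_position` and `filter_params` are hypothetical, and positions are 1-based to mirror legacy SQL's `POSITION()`.

```python
# Toy model of the table from the question: each row holds two parallel arrays.
T = [
    {"row": 1, "query_params": ["foo", "param"], "query_values": ["bar", "val"]},
    {"row": 2, "query_params": ["foo"], "query_values": ["baz"]},
]


def flatten_with_position(table, field):
    """Emit one (row, pos, element) tuple per array element; pos starts at 1."""
    return [
        (record["row"], pos, element)
        for record in table
        for pos, element in enumerate(record[field], start=1)
    ]


def filter_params(table, wanted):
    """Model of the JOIN: filter params first, then match values by (row, pos)."""
    # t1: flattened query_params, keeping only the wanted parameter name.
    t1 = [t for t in flatten_with_position(table, "query_params") if t[2] == wanted]
    # t2: flattened query_values, indexed by (row, pos) for the lookup.
    t2 = {
        (row, pos): value
        for row, pos, value in flatten_with_position(table, "query_values")
    }
    # Inner join on row and position.
    return [(row, param, t2[(row, pos)]) for row, pos, param in t1 if (row, pos) in t2]


result = filter_params(T, "foo")
print(result)  # [(1, 'foo', 'bar'), (2, 'foo', 'baz')]
```

The filter is applied before the join, so only the positions of matching parameters are looked up on the values side.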

Answer 1: (score: 0)

Try the following version:

    SELECT [row], query_params, query_values 
    FROM (
        SELECT [row], query_params, param_pos, query_values, POSITION(query_values) AS value_pos 
        FROM FLATTEN((
            SELECT [row], query_params, POSITION(query_params) AS param_pos, query_values 
            FROM YourTable
        ), query_params)
        WHERE query_params = 'foo'
    )
    WHERE param_pos = value_pos
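
As a rough illustration of why the final `param_pos = value_pos` filter is needed, the Python sketch below models this query under the assumption that flattening only query_params leaves every value of the row paired with each remaining parameter. The list `your_table` and the function `filter_by_param` are hypothetical names used only for this example.

```python
# Toy model of YourTable: two parallel repeated fields per row.
your_table = [
    {"row": 1, "query_params": ["foo", "param"], "query_values": ["bar", "val"]},
    {"row": 2, "query_params": ["foo"], "query_values": ["baz"]},
]


def filter_by_param(table, wanted):
    """Keep (row, param, value) triples where the param matches and positions agree."""
    out = []
    for record in table:
        # Inner query: flatten query_params and remember each 1-based position.
        for param_pos, param in enumerate(record["query_params"], start=1):
            if param != wanted:  # WHERE query_params = 'foo'
                continue
            # query_values is still repeated here, so each value is a candidate.
            for value_pos, value in enumerate(record["query_values"], start=1):
                if param_pos == value_pos:  # WHERE param_pos = value_pos
                    out.append((record["row"], param, value))
    return out


matches = filter_by_param(your_table, "param")
print(matches)  # [(1, 'param', 'val')]
```

Without the position check, 'param' in row 1 would also be paired with 'bar'.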