BigQuery throws an error when a query with joins is not flattened

Asked: 2016-01-18 08:19:00

Tags: google-bigquery

The query is:

select
    cb.subnum as subnum,
    last(
        if(
            (if(cu.smartcard_number is not null, 1, 0)) +
            (if(rr.smart_card_number is not null, 1, 0)) > 0, 1, 0)   
    ) as econnected_i,

from 
    combined.table1 as cb

left outer join each dataflow_raw_eu.table2 as cu
on cu.smartcard_number = cb.smart_card_num

left outer join each dataflow_raw_eu.table3 as rr
on rr.smart_card_number = cb.smart_card_num

group by subnum

The error is:

Error: Ambiguous field name 'imported_at' in JOIN. Please use the table qualifier before field name.

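I noticed that the query succeeds when it runs with only one join to a single table. imported_at is a timestamp field shared by all 3 tables (the only field all 3 have in common), but it is not referenced anywhere in the query.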

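If I select flatten_results in the BigQuery options, the query succeeds; however, I want to run future queries with nested records. None of the tables in the query above have repeated or record fields.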

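1 Answer:

Answer 0: (score: 2)

This looks like it may be a BigQuery bug.

Please try the following workaround. Each joined table is wrapped in a subselect that exposes only the join key, so imported_at should no longer appear on more than one side of the join: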

SELECT
    cb.subnum AS subnum,
    LAST(
        IF(
            (IF(cu.smartcard_number IS NOT NULL, 1, 0)) +
            (IF(rr.smart_card_number IS NOT NULL, 1, 0)) > 0, 1, 0)   
    ) AS econnected_i,

FROM 
    combined.table1 AS cb

LEFT OUTER JOIN EACH (SELECT smartcard_number FROM dataflow_raw_eu.table2) AS cu
ON cu.smartcard_number = cb.smart_card_num

LEFT OUTER JOIN EACH (SELECT smart_card_number FROM dataflow_raw_eu.table3) AS rr
ON rr.smart_card_number = cb.smart_card_num

GROUP BY subnum  

Note that, depending on the logic and the nature of the data in dataflow_raw_eu.table2 and dataflow_raw_eu.table3, you may want to use GROUP BY in the subselects, as shown below:

SELECT smartcard_number FROM dataflow_raw_eu.table2 GROUP BY smartcard_number   

SELECT smart_card_number FROM dataflow_raw_eu.table3 GROUP BY smart_card_number
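
For reference, here is a sketch of how the full workaround might look with the GROUP BY subselects dropped in. It has not been run against the real tables and assumes the same table and column names as in the question; the only change from the workaround above is that each subselect returns one row per card number, which keeps duplicate keys in table2 or table3 from multiplying rows in the join.

```sql
SELECT
    cb.subnum AS subnum,
    LAST(
        IF(
            (IF(cu.smartcard_number IS NOT NULL, 1, 0)) +
            (IF(rr.smart_card_number IS NOT NULL, 1, 0)) > 0, 1, 0)
    ) AS econnected_i

FROM
    combined.table1 AS cb

-- Only the join key is exposed, deduplicated with GROUP BY
LEFT OUTER JOIN EACH (
    SELECT smartcard_number
    FROM dataflow_raw_eu.table2
    GROUP BY smartcard_number
) AS cu
ON cu.smartcard_number = cb.smart_card_num

-- Same pattern for the second joined table
LEFT OUTER JOIN EACH (
    SELECT smart_card_number
    FROM dataflow_raw_eu.table3
    GROUP BY smart_card_number
) AS rr
ON rr.smart_card_number = cb.smart_card_num

GROUP BY subnum
```

Since econnected_i only checks whether a match exists, deduplicating the keys should not change the result for this particular query; it mainly reduces the amount of data going through the join.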