"Join too large" error when I am not using JOIN

Asked: 2014-08-27 14:43:44

Tags: google-bigquery

select 
CASE 
    WHEN .....
    ELSE .....
END AS carrier,
count(vehicle_id) as cnt
from test.vehicle_info 
WHERE vehicle_id NOT IN(select hardware_id 
                        from TABLE_DATE_RANGE(test.gps32_, DATE_ADD(CURRENT_TIMESTAMP(), -6, 'DAY'), DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY')))
group by carrier
order by cnt

I get this error:

Query Failed
Error: Table too large for JOIN. Consider using JOIN EACH. For more details, please see https://developers.google.com/bigquery/docs/query-reference#joins
Job ID: red-road-574:job_e2o6sBjO9Dt5QrU_cRM2VHSRTso

What is the cause, and how can I fix it?

1 answer:

Answer 0 (score: 2)

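@Hobbs's guess above is correct. SEMIJOIN (WHERE ... IN ...) and ANTIJOIN (WHERE ... NOT IN ...) are implemented as JOIN operations, so they hit the same table-size limit as an explicit JOIN. The way around the limit is to rewrite the query yourself as a join using JOIN EACH. That is: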

select 
CASE 
    WHEN .....
    ELSE .....
END AS carrier,
count(vi.vehicle_id) as cnt
from test.vehicle_info vi
LEFT OUTER JOIN EACH (select hardware_id FROM TABLE_DATE_RANGE(...)) hi
ON vi.vehicle_id = hi.hardware_id
WHERE hi.hardware_id is NULL
group by carrier
order by cnt
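
To see that the rewrite keeps the same rows, here is a minimal sketch that runs both forms against a tiny in-memory SQLite database. This is only a local stand-in, not BigQuery: the `gps` table is a hypothetical replacement for the `TABLE_DATE_RANGE(...)` subquery, the sample rows are made up, and plain `LEFT OUTER JOIN` is used because `JOIN EACH` is BigQuery-specific syntax that SQLite does not have.

```python
import sqlite3

# In-memory stand-in tables; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle_info (vehicle_id TEXT);
    CREATE TABLE gps (hardware_id TEXT);
    INSERT INTO vehicle_info VALUES ('a'), ('b'), ('c');
    INSERT INTO gps VALUES ('a'), ('a'), ('c');
""")

# Original form: anti-join expressed with NOT IN.
not_in = conn.execute("""
    SELECT vehicle_id
    FROM vehicle_info
    WHERE vehicle_id NOT IN (SELECT hardware_id FROM gps)
    ORDER BY vehicle_id
""").fetchall()

# Rewritten form: LEFT OUTER JOIN, keeping only rows with no match.
anti_join = conn.execute("""
    SELECT vi.vehicle_id
    FROM vehicle_info vi
    LEFT OUTER JOIN gps hi ON vi.vehicle_id = hi.hardware_id
    WHERE hi.hardware_id IS NULL
    ORDER BY vi.vehicle_id
""").fetchall()

print(not_in, anti_join)
```

Both queries return only the vehicle with no matching GPS row. Note that the equivalence assumes `hardware_id` is never NULL: in standard SQL, a NULL in the subquery makes `NOT IN` return no rows, while the join form is unaffected.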