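I have a base table XYZ with the following columns:
idCustomer, idDevice, date, visit_time
I also have two other tables, A and B (identical in structure), with the following columns:
idCustomer, idDevice, date, visit_time, channel_name, medium_name
I want to join XYZ to A and B on the columns: idCustomer, idDevice and visit_time
to get the columns: channel_name, medium_name (from either A or B)
Here is the tricky part I have been struggling with.
I want to take channel_name, medium_name from table A if
XYZ.idCustomer = A.idCustomer and XYZ.visit_time = A.visit_time for any given idCustomer.
If there is no match, then I want to take channel_name, medium_name from B
if XYZ.idDevice = B.idDevice and XYZ.visit_time = B.visit_time
That's the best I can explain it. Any help would be much appreciated.
Answer 0 (score: 2)
Below is for BigQuery Standard SQL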
#standardSQL
SELECT
t.idCustomer,
t.idDevice,
t.visit_time,
CASE
WHEN NOT a.idCustomer IS NULL THEN a.channel_name
WHEN NOT b.idDevice IS NULL THEN b.channel_name
END channel_name,
CASE
WHEN NOT a.idCustomer IS NULL THEN a.medium_name
WHEN NOT b.idDevice IS NULL THEN b.medium_name
END medium_name
FROM `project.dataset.XYZ` t
LEFT JOIN `project.dataset.A` a
ON t.idCustomer = a.idCustomer AND t.visit_time = a.visit_time
LEFT JOIN `project.dataset.B` b
ON t.idDevice = b.idDevice AND t.visit_time = b.visit_time
Another version of the above (it depends on data quality; see the note at the bottom)
#standardSQL
SELECT
t.idCustomer,
t.idDevice,
t.visit_time,
COALESCE(a.channel_name, b.channel_name) channel_name,
COALESCE(a.medium_name, b.medium_name) medium_name,
FROM `project.dataset.XYZ` t
LEFT JOIN `project.dataset.A` a
ON t.idCustomer = a.idCustomer AND t.visit_time = a.visit_time
LEFT JOIN `project.dataset.B` b
ON t.idDevice = b.idDevice AND t.visit_time = b.visit_time
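Note: this (second) version works properly only if the channel_name and medium_name columns are both non-NULL in the respective matching table. Otherwise you can end up with one field taken from A and the other from B, so the first version would ...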
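The difference described in the note can be seen on a tiny example. The following is only a sketch: it uses Python's built-in sqlite3 module instead of BigQuery so that it can be run locally, and the table contents are made up for illustration. The row in A matches on idCustomer but has a NULL medium_name, while the row in B matches on idDevice and has both fields filled.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE XYZ (idCustomer INTEGER, idDevice INTEGER, visit_time TEXT);
CREATE TABLE A (idCustomer INTEGER, idDevice INTEGER, visit_time TEXT,
                channel_name TEXT, medium_name TEXT);
CREATE TABLE B (idCustomer INTEGER, idDevice INTEGER, visit_time TEXT,
                channel_name TEXT, medium_name TEXT);

-- Hypothetical sample data: A matches on customer but medium_name is NULL,
-- B matches on device and has both columns populated.
INSERT INTO XYZ VALUES (1, 10, '2020-01-01 00:00:00');
INSERT INTO A VALUES (1, 10, '2020-01-01 00:00:00', 'email', NULL);
INSERT INTO B VALUES (2, 10, '2020-01-01 00:00:00', 'search', 'cpc');
""")

# Same join conditions as in the answer's queries.
JOINS = """
FROM XYZ AS t
LEFT JOIN A AS a
  ON t.idCustomer = a.idCustomer AND t.visit_time = a.visit_time
LEFT JOIN B AS b
  ON t.idDevice = b.idDevice AND t.visit_time = b.visit_time
"""

# First version: the whole row comes from A whenever A matched.
case_row = conn.execute("""
SELECT
  CASE WHEN a.idCustomer IS NOT NULL THEN a.channel_name
       WHEN b.idDevice IS NOT NULL THEN b.channel_name END,
  CASE WHEN a.idCustomer IS NOT NULL THEN a.medium_name
       WHEN b.idDevice IS NOT NULL THEN b.medium_name END
""" + JOINS).fetchone()

# Second version: each column falls back to B independently.
coalesce_row = conn.execute("""
SELECT
  COALESCE(a.channel_name, b.channel_name),
  COALESCE(a.medium_name, b.medium_name)
""" + JOINS).fetchone()

print(case_row)      # ('email', None)
print(coalesce_row)  # ('email', 'cpc')
```

With this data the CASE version keeps both values from A (including the NULL), whereas the COALESCE version mixes channel_name from A with medium_name from B.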