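I have a table containing tracking data. Among other values, the table has the columns traffic_medium, traffic_source and traffic_campaign. These columns sometimes contain (none) or null as a value.
I want to use a left join, with medium, source and campaign as the match conditions, to bring in the total number of visitors from another table.
This works as long as all three columns contain data. It does not work when one of the columns has (none) or null as its value.
I am using BigQuery with legacy SQL.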
SELECT
A.id,
A.trafficSource_medium,
A.trafficSource_source,
A.trafficSource_campaign,
B.sum_visitor AS sum_visitor
FROM [table] AS A
left outer join (Select
count(distinct fullvisitorID) as sum_visitor,
trafficSource_medium,
trafficSource_source,
trafficSource_campaign
FROM [table2]
GROUP BY trafficSource_medium,
trafficSource_source,
trafficSource_campaign)
AS B
on A.trafficSource_medium=B.trafficSource_medium AND
A.trafficSource_source=B.trafficSource_source AND
A.trafficSource_campaign=B.trafficSource_campaign
Thanks for your help!
Answer 0 (score: 0)
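Try the query below. It assumes the respective fields are of STRING type. If they are INT, replace 'n/a' with, say, -999. The important thing is to pick a constant that never occurs as a real value in the respective field.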
#legacySQL
SELECT
A.id,
CASE WHEN A.trafficSource_medium = 'n/a' THEN NULL ELSE A.trafficSource_medium END AS trafficSource_medium,
CASE WHEN A.trafficSource_source = 'n/a' THEN NULL ELSE A.trafficSource_source END AS trafficSource_source,
CASE WHEN A.trafficSource_campaign = 'n/a' THEN NULL ELSE A.trafficSource_campaign END AS trafficSource_campaign,
B.sum_visitor AS sum_visitor
FROM (
SELECT
id,
IFNULL(trafficSource_medium, 'n/a') AS trafficSource_medium,
IFNULL(trafficSource_source, 'n/a') AS trafficSource_source,
IFNULL(trafficSource_campaign, 'n/a') AS trafficSource_campaign
FROM [table]
) AS A
LEFT OUTER JOIN (
SELECT
COUNT(DISTINCT fullvisitorID) AS sum_visitor,
IFNULL(trafficSource_medium, 'n/a') AS trafficSource_medium,
IFNULL(trafficSource_source, 'n/a') AS trafficSource_source,
IFNULL(trafficSource_campaign, 'n/a') AS trafficSource_campaign
FROM [table2]
GROUP BY
trafficSource_medium,
trafficSource_source,
trafficSource_campaign
) AS B
ON A.trafficSource_medium = B.trafficSource_medium
AND A.trafficSource_source = B.trafficSource_source
AND A.trafficSource_campaign = B.trafficSource_campaign
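The idea here is to "convert" NULLs into some placeholder value so that they become JOIN-able (NULL = NULL never evaluates to true, so NULL keys never match each other), and then to "convert" the placeholder back to NULL in the final SELECT.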
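To see the effect without touching BigQuery, here is a minimal local sketch of the same round trip. It is only an illustration under assumptions: SQLite (via Python's sqlite3 module) stands in for BigQuery, and the tables a and b, their single medium column and the sample rows are all made up. The NULL comparison behaviour it shows is standard SQL, but the query is not the legacy SQL syntax from above.

```python
import sqlite3

# Assumption: SQLite is only a stand-in for BigQuery; tables and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, medium TEXT);
    CREATE TABLE b (medium TEXT, sum_visitor INTEGER);
    INSERT INTO a VALUES (1, 'cpc'), (2, NULL);
    INSERT INTO b VALUES ('cpc', 10), (NULL, 7);
""")

# Plain equality join: the NULL key in row 2 never matches the NULL key in b.
plain = conn.execute("""
    SELECT a.id, a.medium, b.sum_visitor
    FROM a LEFT JOIN b ON a.medium = b.medium
    ORDER BY a.id
""").fetchall()

# Round trip: NULL -> 'n/a' inside both subqueries, join, then 'n/a' -> NULL again.
round_trip = conn.execute("""
    SELECT a.id,
           CASE WHEN a.medium = 'n/a' THEN NULL ELSE a.medium END AS medium,
           b.sum_visitor
    FROM (SELECT id, IFNULL(medium, 'n/a') AS medium FROM a) AS a
    LEFT JOIN (SELECT IFNULL(medium, 'n/a') AS medium, sum_visitor FROM b) AS b
      ON a.medium = b.medium
    ORDER BY a.id
""").fetchall()

print(plain)       # [(1, 'cpc', 10), (2, None, None)]
print(round_trip)  # [(1, 'cpc', 10), (2, None, 7)]
```

In the second result, row 2 picks up its visitor count while its medium is still reported as NULL (None in Python), which is what the outer CASE expressions in the query above are for.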
If you can migrate to standard SQL, you can try the code below. It requires fewer changes to your original query, mostly in the ON clause.
#standardSQL
SELECT
A.id,
A.trafficSource_medium,
A.trafficSource_source,
A.trafficSource_campaign,
B.sum_visitor AS sum_visitor
FROM `table` AS A
LEFT OUTER JOIN (
SELECT
COUNT(DISTINCT fullvisitorID) AS sum_visitor,
trafficSource_medium,
trafficSource_source,
trafficSource_campaign
FROM `table2`
GROUP BY
trafficSource_medium,
trafficSource_source,
trafficSource_campaign
) AS B
ON IFNULL(A.trafficSource_medium, 'n/a') = IFNULL(B.trafficSource_medium, 'n/a')
AND IFNULL(A.trafficSource_source, 'n/a') = IFNULL(B.trafficSource_source, 'n/a')
AND IFNULL(A.trafficSource_campaign, 'n/a') = IFNULL(B.trafficSource_campaign, 'n/a')
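The note about picking a constant that is not used as a real value matters for this ON-clause variant as well. The sketch below tries to illustrate that under assumptions: SQLite again stands in for BigQuery, the tables, rows and the join_with helper are hypothetical, and '__null__' is just an example of a placeholder assumed not to occur in the data.

```python
import sqlite3

# Assumption: SQLite is only a stand-in for BigQuery; tables and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, campaign TEXT);
    CREATE TABLE b (campaign TEXT, sum_visitor INTEGER);
    INSERT INTO a VALUES (1, NULL), (2, 'n/a');  -- row 2 holds a literal 'n/a'
    INSERT INTO b VALUES (NULL, 7);
""")

def join_with(sentinel):
    """Hypothetical helper: left join with NULLs replaced by `sentinel` in the ON clause."""
    return conn.execute("""
        SELECT a.id, b.sum_visitor
        FROM a LEFT JOIN b
          ON IFNULL(a.campaign, ?) = IFNULL(b.campaign, ?)
        ORDER BY a.id
    """, (sentinel, sentinel)).fetchall()

collides = join_with("n/a")   # sentinel equals a real value in a.campaign
safe = join_with("__null__")  # sentinel assumed absent from the data

print(collides)  # [(1, 7), (2, 7)]
print(safe)      # [(1, 7), (2, None)]
```

With 'n/a' as the placeholder, the row that really contains 'n/a' is wrongly matched to the NULL group. With a placeholder that does not occur in the data, only the NULL row matches.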