BigQuery: coalesce join when the key is null

Date: 2016-03-22 17:53:42

Tags: google-bigquery

Table 1

+---------+-----------+--------+
| user_id | email     | action |
+---------+-----------+--------+
| 1       | aa@aa.com | open   |
+---------+-----------+--------+
| 2       | null      | click  |
+---------+-----------+--------+
| 3       | ac@ac.com | click  |
+---------+-----------+--------+
| 4       | ad@ad.com | open   |
+---------+-----------+--------+

Table 2

+---------+-----------+--------+
| user_id | email     | event  |
+---------+-----------+--------+
| 1       | aa@aa.com | sent   |
+---------+-----------+--------+
| null    | ac@ac.com | none   |
+---------+-----------+--------+
| 2       | ab@ab.com | sent   |
+---------+-----------+--------+
| 4       | ad@ad.com | sent   |
+---------+-----------+--------+

I want to join on t1.user_id = t2.user_id, but when that key is null, join on t1.email = t2.email instead.

I tried several ways of writing this join in BigQuery:
1.) ON COALESCE(t1.user_id, t1.email) = COALESCE(t2.user_id, t2.email)
2.) ON CASE WHEN t2.user_id IS NOT NULL THEN t1.user_id = t2.user_id ELSE t1.email = t2.email END

Neither worked. What should I do?

1 answer:

Answer 0: (score: 1)

I would split this join into two separate joins:
First, join by user_id

SELECT *
FROM table1 AS t1
JOIN table2 AS t2
ON t1.user_id = t2.user_id
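
To see what this first join returns on the sample data, here is a minimal sketch that loads both tables into an in-memory SQLite database and runs the same query. SQLite is only a local stand-in for BigQuery (an assumption made so the example can run anywhere), and the column list in the SELECT is trimmed for readability. User 3 and the row with a null user_id drop out, because a comparison with NULL is never true.

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery so the query can run locally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user_id INTEGER, email TEXT, action TEXT);
CREATE TABLE table2 (user_id INTEGER, email TEXT, event TEXT);
INSERT INTO table1 VALUES
  (1, 'aa@aa.com', 'open'), (2, NULL, 'click'),
  (3, 'ac@ac.com', 'click'), (4, 'ad@ad.com', 'open');
INSERT INTO table2 VALUES
  (1, 'aa@aa.com', 'sent'), (NULL, 'ac@ac.com', 'none'),
  (2, 'ab@ab.com', 'sent'), (4, 'ad@ad.com', 'sent');
""")

# Same join as in the answer, with an explicit column list and ordering.
first_join = conn.execute("""
SELECT t1.user_id, t1.action, t2.event
FROM table1 AS t1
JOIN table2 AS t2
ON t1.user_id = t2.user_id
ORDER BY t1.user_id
""").fetchall()

print(first_join)
# [(1, 'open', 'sent'), (2, 'click', 'sent'), (4, 'open', 'sent')]
```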

Second, join by email the rows that the first join missed

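SELECT *
FROM (
  SELECT * FROM table1
  -- Keep null ids explicitly: NULL NOT IN (...) evaluates to NULL,
  -- which would silently drop the row.
  WHERE user_id IS NULL OR user_id NOT IN (
    -- ids already matched by the first join
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t1
JOIN (
  SELECT * FROM table2
  WHERE user_id IS NULL OR user_id NOT IN (
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t2
ON t1.email = t2.email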
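
The two steps can be checked end to end with a small sketch. It again uses an in-memory SQLite database as a local stand-in for BigQuery (an assumption, as is the helper name `MATCHED`), and it shows three things: a plain `NOT IN` filter loses the row whose user_id is null, an `IS NULL OR NOT IN` filter keeps it, and the two joins glued together with `UNION ALL` return all four expected rows. The last query is an alternative that is not part of the answer: a single join with an `OR` condition. It produces the same rows on this sample data, but whether your BigQuery dialect accepts it, and how it performs, should be verified separately.

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery so the queries can run locally.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user_id INTEGER, email TEXT, action TEXT);
CREATE TABLE table2 (user_id INTEGER, email TEXT, event TEXT);
INSERT INTO table1 VALUES
  (1, 'aa@aa.com', 'open'), (2, NULL, 'click'),
  (3, 'ac@ac.com', 'click'), (4, 'ad@ad.com', 'open');
INSERT INTO table2 VALUES
  (1, 'aa@aa.com', 'sent'), (NULL, 'ac@ac.com', 'none'),
  (2, 'ab@ab.com', 'sent'), (4, 'ad@ad.com', 'sent');
""")

# Hypothetical helper: ids already matched by the user_id join.
MATCHED = """
SELECT t1.user_id FROM table1 AS t1
JOIN table2 AS t2 ON t1.user_id = t2.user_id
"""

# Plain NOT IN: NULL NOT IN (...) is NULL, so the null-id row is filtered out.
unguarded = conn.execute(
    f"SELECT email FROM table2 WHERE user_id NOT IN ({MATCHED})"
).fetchall()

# Guarded filter keeps the null-id row for the email join.
guarded = conn.execute(
    f"SELECT email FROM table2 "
    f"WHERE user_id IS NULL OR user_id NOT IN ({MATCHED})"
).fetchall()

# Step 1 (join by user_id) plus step 2 (join leftovers by email).
combined = conn.execute(f"""
SELECT t1.user_id, t1.action, t2.event
FROM table1 AS t1
JOIN table2 AS t2 ON t1.user_id = t2.user_id
UNION ALL
SELECT t1.user_id, t1.action, t2.event
FROM (SELECT * FROM table1
      WHERE user_id IS NULL OR user_id NOT IN ({MATCHED})) AS t1
JOIN (SELECT * FROM table2
      WHERE user_id IS NULL OR user_id NOT IN ({MATCHED})) AS t2
ON t1.email = t2.email
ORDER BY 1
""").fetchall()

# Alternative (not from the answer): one join with an OR fallback on email.
single = conn.execute("""
SELECT t1.user_id, t1.action, t2.event
FROM table1 AS t1
JOIN table2 AS t2
ON t1.user_id = t2.user_id
   OR ((t1.user_id IS NULL OR t2.user_id IS NULL) AND t1.email = t2.email)
ORDER BY t1.user_id
""").fetchall()

print(unguarded)  # []
print(guarded)    # [('ac@ac.com',)]
print(combined)
# [(1, 'open', 'sent'), (2, 'click', 'sent'), (3, 'click', 'none'), (4, 'open', 'sent')]
print(single == combined)  # True
```

The `UNION ALL` form stays closest to the answer: the second branch only sees rows the first join left unmatched, so no pair can be reported twice.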