I need to join two tables.
I tried something like this:
SELECT
CASE
WHEN b.domain IS NULL then "Invalid"
ELSE "Valid"
END as Validated
FROM Emails e
LEFT JOIN DomainBlacklist b
ON ENDS_WITH(LOWER(e.email), LOWER(b.domain))
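But it raises an error:
"LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join."
Does anyone know how I can solve this?
Thanks!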
Answer 0 (score: 2)
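In theory it should be possible to express this as a join with an equality condition; you first need to extract the domain from the email address, that is, the part after the @: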
SELECT
CASE
WHEN b.domain IS NULL then "Invalid"
ELSE "Valid"
END as Validated
FROM Emails e
LEFT JOIN DomainBlacklist b
ON LOWER(SPLIT(e.email, '@')[SAFE_OFFSET(1)]) = LOWER(b.domain)
With sample data:
WITH Emails AS (
SELECT 'elliott@example.com' AS email UNION ALL
SELECT 'a@b.com' UNION ALL
SELECT 'invalid_email' UNION ALL
SELECT 'foo@bar.com'
), DomainBlacklist AS (
SELECT 'example.com' AS domain UNION ALL
SELECT 'bar.com'
)
SELECT
CASE
WHEN b.domain IS NULL then "Invalid"
ELSE "Valid"
END as Validated
FROM Emails e
LEFT JOIN DomainBlacklist b
ON LOWER(SPLIT(e.email, '@')[SAFE_OFFSET(1)]) = LOWER(b.domain)
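One thing to keep in mind with the equality join above: it compares the whole domain, so an address such as user@mail.example.com no longer matches a blacklist entry of example.com, whereas the original ENDS_WITH condition would have matched it. The sketch below is only an illustration of that difference. The EmailDomains CTE, the email_domain and is_blacklisted column names, and the extra sample rows are my own assumptions, not part of the answer.

```sql
#standardSQL
-- Sketch only: same equality join as above, but keeping the email and the
-- extracted domain in the output so each row can be inspected.
WITH Emails AS (
  SELECT 'elliott@example.com' AS email UNION ALL
  SELECT 'Foo@Bar.com' UNION ALL            -- mixed case, handled by LOWER
  SELECT 'user@mail.example.com' UNION ALL  -- subdomain, not an exact match
  SELECT 'invalid_email'                    -- no '@', so the domain is NULL
), DomainBlacklist AS (
  SELECT 'example.com' AS domain UNION ALL
  SELECT 'bar.com'
), EmailDomains AS (
  -- Hypothetical helper CTE: extract the lowercased domain once.
  SELECT
    email,
    LOWER(SPLIT(email, '@')[SAFE_OFFSET(1)]) AS email_domain
  FROM Emails
)
SELECT
  e.email,
  e.email_domain,
  b.domain IS NOT NULL AS is_blacklisted
FROM EmailDomains e
LEFT JOIN DomainBlacklist b
  ON e.email_domain = LOWER(b.domain)
```

A boolean column is used here instead of the "Valid"/"Invalid" labels so that the sketch does not take a position on which label a blacklist hit should get. If subdomains must also be caught, the equality join alone is probably not enough.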
Answer 1 (score: 2)
Below is for BigQuery Standard SQL
#standardSQL
SELECT email,
IF(MAX(ENDS_WITH(LOWER(email), LOWER(domain))), 'invalid', 'valid') AS Validated
FROM `project.dataset.Emails`
CROSS JOIN `project.dataset.DomainBlacklist`
GROUP BY email
You can test / play with the above query using dummy data, as below
#standardSQL
WITH `project.dataset.Emails` AS (
SELECT email
FROM UNNEST(['user1@abc.com','user2@abc.com','user3@uvw.com','user4@xyz.com']) AS email
), `project.dataset.DomainBlacklist` AS (
SELECT domain
FROM UNNEST(['uvw.com','qwe.net']) AS domain
)
SELECT email,
IF(MAX(ENDS_WITH(LOWER(email), LOWER(domain))), 'invalid', 'valid') AS Validated
FROM `project.dataset.Emails`
CROSS JOIN `project.dataset.DomainBlacklist`
GROUP BY email
The result is
email Validated
user1@abc.com valid
user2@abc.com valid
user3@uvw.com invalid
user4@xyz.com valid
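Two properties of the CROSS JOIN version are worth knowing. If DomainBlacklist is empty the cross join produces no rows at all, and a plain ENDS_WITH also flags an address like user6@notuvw.com, because that string happens to end with uvw.com. The following is a rough sketch of one possible way around both, not a definitive implementation: it collapses the blacklist into a single array row and anchors the suffix at '@' or '.'. The Blacklist CTE, the domains column, and the extra sample addresses are assumptions of mine.

```sql
#standardSQL
-- Sketch only: blacklist check that keeps every email row even when the
-- blacklist is empty, and only matches whole domains or their subdomains.
WITH Emails AS (
  SELECT email
  FROM UNNEST(['user1@abc.com', 'user3@uvw.com',
               'user5@mail.uvw.com', 'user6@notuvw.com']) AS email
), DomainBlacklist AS (
  SELECT domain
  FROM UNNEST(['uvw.com', 'qwe.net']) AS domain
), Blacklist AS (
  -- Hypothetical helper CTE: always exactly one row; the array is NULL
  -- when DomainBlacklist has no rows, and UNNEST(NULL) yields no rows.
  SELECT ARRAY_AGG(LOWER(domain) IGNORE NULLS) AS domains
  FROM DomainBlacklist
)
SELECT
  e.email,
  IF(
    EXISTS(
      SELECT 1
      FROM UNNEST(bl.domains) AS domain
      -- '@uvw.com' matches the domain itself, '.uvw.com' matches subdomains.
      WHERE ENDS_WITH(LOWER(e.email), CONCAT('@', domain))
         OR ENDS_WITH(LOWER(e.email), CONCAT('.', domain))
    ),
    'invalid', 'valid') AS Validated
FROM Emails e
CROSS JOIN Blacklist bl
```

With this sample data, user3@uvw.com and user5@mail.uvw.com should come out as invalid, while user1@abc.com and user6@notuvw.com stay valid. Because there is no GROUP BY, duplicate emails in the input are kept as separate rows rather than merged.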