Select customer IDs that have not yet purchased product X

Time: 2017-11-16 17:59:15

Tags: sql google-bigquery self-join

I have a table of customer IDs and the products they purchased. A customer ID can purchase multiple products over time.

customerID,productID


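In BigQuery, I need to find the CustomerID of everyone who has not yet purchased product A.

I have been trying self joins and inner joins, but I am getting nowhere.

Any help is appreciated.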

3 Answers:

Answer 0: (score: 3)

select customerID
from your_table
group by customerID
having sum(case when productID = 'A' then 1 else 0 end) = 0

And to check whether the product ID merely contains a given name:

sum(case when productID contains 'XYZ' then 1 else 0 end) = 0
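
To try this pattern without a BigQuery project, here is a minimal sketch that runs the same HAVING logic against an in-memory SQLite database from Python. The table name `purchases` and the sample rows are made up for illustration. SQLite has no `CONTAINS` operator, so the substring check uses `instr(...) > 0` as a stand-in.

```python
import sqlite3

# Hypothetical sample data: (customerID, productID). Not from the question.
rows = [
    (1, "A"), (1, "B"),      # bought A and B
    (2, "B"),                # never bought A
    (3, "C"), (3, "AXYZ"),   # never bought exactly 'A', but has an 'XYZ' product
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customerID INTEGER, productID TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

# Exact match: customers with zero rows where productID = 'A'.
not_bought_a = [r[0] for r in conn.execute("""
    SELECT customerID
    FROM purchases
    GROUP BY customerID
    HAVING SUM(CASE WHEN productID = 'A' THEN 1 ELSE 0 END) = 0
    ORDER BY customerID
""")]

# Substring match: instr() is a case-sensitive SQLite stand-in for CONTAINS.
no_xyz = [r[0] for r in conn.execute("""
    SELECT customerID
    FROM purchases
    GROUP BY customerID
    HAVING SUM(CASE WHEN instr(productID, 'XYZ') > 0 THEN 1 ELSE 0 END) = 0
    ORDER BY customerID
""")]

print(not_bought_a)  # [2, 3]
print(no_xyz)        # [1, 2]
```

Customer 3 shows why the two checks differ: 'AXYZ' is not equal to 'A', so that customer passes the exact-match filter but fails the substring one.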

Answer 1: (score: 1)

Below is for BigQuery Standard SQL

#standardSQL
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(Product = 'A') = 0

You can test / play with it using dummy data as below

#standardSQL
WITH `project.dataset.yourTable` AS (
  SELECT 1234 CustomerID, 'A' Product UNION ALL
  SELECT 11234, 'A' UNION ALL
  SELECT 4567, 'A' UNION ALL
  SELECT 7896, 'C' UNION ALL
  SELECT 5432, 'B' 
)
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(Product = 'A') = 0  
  

How would I adjust it so that it matches any productID containing "xyz"?

#standardSQL
WITH `project.dataset.yourTable` AS (
  SELECT 1234 CustomerID, 'Axyz' Product UNION ALL
  SELECT 11234, 'A' UNION ALL
  SELECT 4567, 'A' UNION ALL
  SELECT 7896, 'Cxyz' UNION ALL
  SELECT 5432, 'B' 
)
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(REGEXP_CONTAINS(Product, 'xyz')) = 0

Answer 2: (score: 0)

If you have a customers table, you might want:

select c.*
from customers c
where not exists (select 1 from t where t.customer_id = c.customer_id and t.productID = 'A');

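This returns customers who have made no purchases at all, as well as customers who have purchased only products other than A. Of course, in your data a customer may be defined as someone who has made a purchase, in which case I like Juergen's solution.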
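
The difference between the two approaches shows up on a small example. The sketch below uses Python's sqlite3 as a local stand-in for BigQuery. The table names `customers` and `t` follow the query in this answer, while the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE t (customer_id INTEGER, productID TEXT);
    -- Hypothetical rows: customer 3 exists but has never purchased anything.
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO t VALUES (1, 'A'), (1, 'B'), (2, 'B');
""")

# NOT EXISTS starts from the customers table, so customer 3 is returned.
via_not_exists = [r[0] for r in conn.execute("""
    SELECT c.customer_id
    FROM customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM t
        WHERE t.customer_id = c.customer_id AND t.productID = 'A'
    )
    ORDER BY c.customer_id
""")]

# GROUP BY only sees customers that appear in the purchases table.
via_group_by = [r[0] for r in conn.execute("""
    SELECT customer_id
    FROM t
    GROUP BY customer_id
    HAVING SUM(CASE WHEN productID = 'A' THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
""")]

print(via_not_exists)  # [2, 3]
print(via_group_by)    # [2]
```

Which result is the right one depends on whether customers with no purchases should count as "has not bought A".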