Query for customers who made purchases on a shared, non-contiguous set of dates

Asked: 2016-06-30 23:37:29

Tags: sql postgresql

RDBMS: PostgreSQL 9.5.3

I have a table ('activity') of the following form:

customerID | date           | purchaseID
-----------------------------------------
1          | 2016-01-01     | 1
2          | 2016-01-01     | 2
3          | 2016-01-01     | 3
2          | 2016-01-02     | 4
1          | 2016-01-03     | 5
2          | 2016-01-03     | 6
3          | 2016-01-03     | 7
1          | 2016-01-04     | 8
2          | 2016-01-04     | 9
3          | 2016-01-05     | 10

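From this table, I want to find all customers who made purchases on the same dates as customerID 1. A customer's purchase history must completely cover customerID 1's purchase dates but need not be limited to them: purchases on additional dates are fine, but they should not be returned in the final result.

The result for the data above should be: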

customerID | date           | purchaseID
-----------------------------------------
2          | 2016-01-01     | 2
2          | 2016-01-03     | 6
2          | 2016-01-04     | 9

Currently I solve this with a loop in application code and then discard all NULL results, so the actual SQL is:

SELECT customerID,
       date,
       purchaseID
FROM activity
WHERE customerID <> 1
   AND date = %date%

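where %date% is an iteration variable that runs over every date on which customerID 1 made a purchase. This is not an elegant solution, and it is extremely slow for large numbers of purchases (millions) or customers (tens of thousands). Any suggestions are welcome.

Thanks for reading.

2 Answers:

Answer 0: (score: 0)

One approach is to use a self-join and aggregation: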

select a.customerid
from activity a join 
     activity a1
     on a1.date = a.date and a1.customerid = 1
where a1.customerid <> a.customerid
group by a.customerID
having count(distinct a1.date) = (select count(distinct date) from activity where customerID = 1)
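
To see why the count comparison works: the join keeps only the rows that fall on one of customer 1's dates, so a customer qualifies exactly when the number of distinct joined dates equals the number of distinct dates customer 1 has. Below is a minimal sketch that runs the same query against the sample data from the question. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained); the helper name `matching_customers` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (customerID, date, purchaseID).
ROWS = [
    (1, "2016-01-01", 1), (2, "2016-01-01", 2), (3, "2016-01-01", 3),
    (2, "2016-01-02", 4), (1, "2016-01-03", 5), (2, "2016-01-03", 6),
    (3, "2016-01-03", 7), (1, "2016-01-04", 8), (2, "2016-01-04", 9),
    (3, "2016-01-05", 10),
]

# Self-join plus aggregation, as in the query above.
MATCHING_CUSTOMERS_SQL = """
SELECT a.customerID
FROM activity a
JOIN activity a1
  ON a1.date = a.date AND a1.customerID = 1
WHERE a.customerID <> a1.customerID
GROUP BY a.customerID
HAVING COUNT(DISTINCT a1.date) =
       (SELECT COUNT(DISTINCT date) FROM activity WHERE customerID = 1)
"""


def matching_customers(rows):
    """Return the IDs of customers who bought on every date customer 1 did."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not PostgreSQL
    conn.execute(
        "CREATE TABLE activity (customerID INTEGER, date TEXT, purchaseID INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)
    result = sorted(row[0] for row in conn.execute(MATCHING_CUSTOMERS_SQL))
    conn.close()
    return result


print(matching_customers(ROWS))  # [2]
```

Customer 3 is excluded because it shares only two of customer 1's three dates.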

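If you want the original records, you can use the following. Note that it returns every row for the matching customers, including rows on dates when customer 1 made no purchase: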

select a.*
from activity a
where a.customerId in (select a.customerid
                       from activity a join 
                            activity a1
                            on a1.date = a.date and a1.customerid = 1
                       where a1.customerid <> a.customerid
                       group by a.customerID
                       having count(distinct a1.date) = (select count(distinct date) from activity where customerID = 1)
                      );
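
The question asks that purchases on extra dates not be returned. One possible way to get that, sketched here and not part of the query above, is to add a date filter to the outer query so that only rows on customer 1's dates survive. The sketch again uses Python's built-in `sqlite3` as a stand-in for PostgreSQL (an assumption); the inner alias `b` and the helper name `rows_on_shared_dates` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (customerID, date, purchaseID).
ROWS = [
    (1, "2016-01-01", 1), (2, "2016-01-01", 2), (3, "2016-01-01", 3),
    (2, "2016-01-02", 4), (1, "2016-01-03", 5), (2, "2016-01-03", 6),
    (3, "2016-01-03", 7), (1, "2016-01-04", 8), (2, "2016-01-04", 9),
    (3, "2016-01-05", 10),
]

# Same customer filter as above, plus a filter on customer 1's dates.
ROWS_ON_SHARED_DATES_SQL = """
SELECT a.customerID, a.date, a.purchaseID
FROM activity a
WHERE a.date IN (SELECT date FROM activity WHERE customerID = 1)
  AND a.customerID IN (
        SELECT b.customerID
        FROM activity b
        JOIN activity a1
          ON a1.date = b.date AND a1.customerID = 1
        WHERE b.customerID <> a1.customerID
        GROUP BY b.customerID
        HAVING COUNT(DISTINCT a1.date) =
               (SELECT COUNT(DISTINCT date) FROM activity WHERE customerID = 1)
      )
ORDER BY a.customerID, a.date, a.purchaseID
"""


def rows_on_shared_dates(rows):
    """Return matching customers' rows, limited to customer 1's dates."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, not PostgreSQL
    conn.execute(
        "CREATE TABLE activity (customerID INTEGER, date TEXT, purchaseID INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)
    result = conn.execute(ROWS_ON_SHARED_DATES_SQL).fetchall()
    conn.close()
    return result


for row in rows_on_shared_dates(ROWS):
    print(row)
# (2, '2016-01-01', 2)
# (2, '2016-01-03', 6)
# (2, '2016-01-04', 9)
```

Customer 2's purchase on 2016-01-02 is dropped because customer 1 bought nothing that day.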

Answer 1: (score: 0)

You can use the "contains" array operator, @>:

with activity (customerID, date, purchaseID) AS (
  values  (1, '2016-01-01'::date, 1), (2, '2016-01-01', 2), (3, '2016-01-01', 3),
          (2, '2016-01-02', 4), (1, '2016-01-03', 5), (2, '2016-01-03', 6),
          (3, '2016-01-03', 7), (1, '2016-01-04', 8), (2, '2016-01-04', 9),
          (3, '2016-01-05', 10))
select customerID
from activity
group by customerID
having customerID <> 1 AND
       array_agg(date) @> array(select date from activity where customerID = 1)
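
For arrays, `@>` is true when every element of the right-hand array also appears in the left-hand array; order and duplicates do not matter. The HAVING clause therefore amounts to a per-customer superset test. The sketch below is a rough Python model of that logic using sets. It is meant only to illustrate the semantics, not to replace the query; the function name `customers_covering` and its `reference_id` parameter are hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (customerID, date, purchaseID).
ROWS = [
    (1, "2016-01-01", 1), (2, "2016-01-01", 2), (3, "2016-01-01", 3),
    (2, "2016-01-02", 4), (1, "2016-01-03", 5), (2, "2016-01-03", 6),
    (3, "2016-01-03", 7), (1, "2016-01-04", 8), (2, "2016-01-04", 9),
    (3, "2016-01-05", 10),
]


def customers_covering(rows, reference_id=1):
    """Model of GROUP BY customerID HAVING array_agg(date) @> array(reference dates)."""
    # Plays the role of array_agg(date) per customer; sets ignore duplicates and order.
    dates_by_customer = defaultdict(set)
    for customer_id, date, _purchase_id in rows:
        dates_by_customer[customer_id].add(date)

    # Plays the role of array(select date from activity where customerID = 1).
    reference_dates = dates_by_customer.get(reference_id, set())

    # `dates >= reference_dates` is the superset test that @> performs.
    return sorted(
        customer_id
        for customer_id, dates in dates_by_customer.items()
        if customer_id != reference_id and dates >= reference_dates
    )


print(customers_covering(ROWS))  # [2]
```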