SQL join of product items based on user ID

Time: 2017-07-24 10:24:37

Tags: sql

Here is a sample dataset:

    | user_id | product_id | dt         | quantity | price |
    | 1       | a          | 2017-05-20 | 2        | 3.95  |
    | 1       | b          | 2017-06-02 | 7        | 19.95 |
    | 2       | a          | 2017-06-23 | 4        | 5.99  |
    | 2       | b          | 2017-04-03 | 2        | 19.95 |
    | 2       | c          | 2017-06-08 | 1        | 9.99  |
    | 3       | a          | 2017-07-02 | 4        | 4.98  |
    | 3       | c          | 2017-06-05 | 3        | 18.95 |

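Provide a SQL query that returns pairs of items (i.e. pairs of item_ids) and counts the number of users who have ordered both items of the pair at least once. For simplicity, we won't take into account the frequency of orders or the quantity purchased, only whether a user has bought a particular item or not. For the sample data above, the output should be: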

    | item_id_1 | item_id_2 | num_users |
    | a         | b         | 2         |
    | a         | c         | 2         |
    | b         | c         | 1         |

3 answers:

Answer 0: (score: 0)

You can do this with a self-join:

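-- the < condition keeps each pair once and drops self-pairs such as (a, a)
select e.product_id as item_id_1, e2.product_id as item_id_2,
       count(distinct e.user_id) as num_users
from example e join
     example e2
     on e.user_id = e2.user_id and
        e.product_id < e2.product_id
group by e.product_id, e2.product_id
order by num_users desc;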
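
As a rough check of the self-join against the sample data, the sketch below runs it with and without a `product_id <` condition in the join. It assumes SQLite through Python's `sqlite3` module and a table named `example`; the `base` template and the `extra` placeholder are illustrative helpers, not part of the answer.

```python
import sqlite3

# Sample rows from the question: (user_id, product_id, dt, quantity, price)
rows = [
    (1, "a", "2017-05-20", 2, 3.95),
    (1, "b", "2017-06-02", 7, 19.95),
    (2, "a", "2017-06-23", 4, 5.99),
    (2, "b", "2017-04-03", 2, 19.95),
    (2, "c", "2017-06-08", 1, 9.99),
    (3, "a", "2017-07-02", 4, 4.98),
    (3, "c", "2017-06-05", 3, 18.95),
]

conn = sqlite3.connect(":memory:")  # assumption: an in-memory SQLite database
conn.execute(
    "create table example "
    "(user_id integer, product_id text, dt text, quantity integer, price real)"
)
conn.executemany("insert into example values (?, ?, ?, ?, ?)", rows)

# Self-join on user_id; {extra} is an optional additional join condition.
base = """
select e.product_id, e2.product_id, count(distinct e.user_id) as num_users
from example e
join example e2 on e.user_id = e2.user_id {extra}
group by e.product_id, e2.product_id
order by e.product_id, e2.product_id
"""

unfiltered = conn.execute(base.format(extra="")).fetchall()
filtered = conn.execute(
    base.format(extra="and e.product_id < e2.product_id")
).fetchall()

print(len(unfiltered))  # 9 rows: 3 self-pairs plus every pair in both orders
print(filtered)         # [('a', 'b', 2), ('a', 'c', 2), ('b', 'c', 1)]
```

Without the extra condition the join also reports rows such as (a, a) and (b, a); with it, only the three pairs from the expected output remain.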

Answer 1: (score: 0)

-- count(distinct ...) so that repeat orders by the same user are counted once
select a.product_id as item_id_1, b.product_id as item_id_2, count(distinct a.user_id) as num_users
from orders a
join orders b
on a.user_id = b.user_id and a.product_id < b.product_id
group by a.product_id, b.product_id
order by num_users desc;
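
Whether the query counts joined rows or distinct users matters once a user orders the same product more than once. The sketch below illustrates the difference, assuming SQLite through Python's `sqlite3` module, a table named `orders` reduced to two columns, and one hypothetical repeat order that is not in the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: an in-memory SQLite database
conn.execute("create table orders (user_id integer, product_id text)")
conn.executemany(
    "insert into orders values (?, ?)",
    [
        (1, "a"), (1, "b"),
        (2, "a"), (2, "b"), (2, "c"),
        (3, "a"), (3, "c"),
        (1, "a"),  # hypothetical repeat order, added only for illustration
    ],
)

# Compare the number of joined rows with the number of distinct users per pair.
query = """
select a.product_id, b.product_id,
       count(*) as pair_rows,
       count(distinct a.user_id) as num_users
from orders a
join orders b
  on a.user_id = b.user_id and a.product_id < b.product_id
group by a.product_id, b.product_id
order by a.product_id, b.product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('a', 'b', 3, 2)
# ('a', 'c', 2, 2)
# ('b', 'c', 1, 1)
```

For the pair (a, b), user 1's two orders of product a each join with the single order of product b, so the row count is 3 while only 2 users bought both.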

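Answer 2: (score: 0)

Assuming a user can order the same product more than once, it is best to group by user and product first.

The two grouped results are then joined on the same user_id and a different product_id. The join keeps only the rows where the first product_id is the lower one, because we want, for example, the combination 'a' & 'b' but not its reverse, 'b' & 'a'.

After that, the result only needs to be grouped and counted.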

select 
 t1.product_id as item_id_1, 
 t2.product_id as item_id_2, 
 count(t1.user_id) as num_users
from 
(
    select user_id, product_id 
    from YourTable
    group by user_id, product_id
) t1
join (
    select user_id, product_id 
    from YourTable 
    group by user_id, product_id
) t2 on (t1.user_id = t2.user_id and t1.product_id < t2.product_id)
group by t1.product_id, t2.product_id
order by t1.product_id, t2.product_id

If your database supports the WITH clause, you can put that same subquery in a common table expression and reuse it.

WITH CTE as (
    select user_id, product_id 
    from YourTable
    group by user_id, product_id
)
select 
 t1.product_id as item_id_1, 
 t2.product_id as item_id_2, 
 count(t1.user_id) as num_users
from CTE t1
join CTE t2 on (t1.user_id = t2.user_id and t1.product_id < t2.product_id)
group by t1.product_id, t2.product_id
order by t1.product_id, t2.product_id
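
One way to try the common table expression is to run it on the question's data, here with SQLite through Python's `sqlite3` module (an assumption; any database with WITH support should behave similarly). The table keeps only the two columns the query uses, and the duplicated (2, 'c') row is a hypothetical repeat order added to exercise the deduplication step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: an in-memory SQLite database
conn.execute("create table YourTable (user_id integer, product_id text)")
conn.executemany(
    "insert into YourTable values (?, ?)",
    [
        (1, "a"), (1, "b"),
        (2, "a"), (2, "b"), (2, "c"),
        (3, "a"), (3, "c"),
        (2, "c"),  # hypothetical repeat order, removed by the group by in the CTE
    ],
)

cte_query = """
WITH CTE as (
    select user_id, product_id
    from YourTable
    group by user_id, product_id
)
select
 t1.product_id as item_id_1,
 t2.product_id as item_id_2,
 count(t1.user_id) as num_users
from CTE t1
join CTE t2 on (t1.user_id = t2.user_id and t1.product_id < t2.product_id)
group by t1.product_id, t2.product_id
order by t1.product_id, t2.product_id
"""
pairs = conn.execute(cte_query).fetchall()
print(pairs)  # [('a', 'b', 2), ('a', 'c', 2), ('b', 'c', 1)]
```

Because the CTE reduces the table to one row per user and product before the join, a plain count is enough here and the repeat order does not inflate the totals.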