Find the average rating of products a customer has not yet purchased

Time: 2015-03-08 17:12:19

Tags: mysql sql aggregate-functions

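For each customer, I am trying to find the products they have not purchased and, among those, the product with the highest average rating.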

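For example, in the tables below, John has purchased products 1 and 2 but not 3, 4, or 5. Products 3 and 5 have no ratings, so they should be excluded, but product 4 should be returned because it is the highest-rated product that John has not purchased.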

Here is the table structure with some sample data:

Customers

id  | customer
----|---------
1   | john
2   | jenkins
3   | jane
4   | janet

Products

id  | description
----|---------
1   | deodorant
2   | soap
3   | shampoo
4   | razor
5   | sponge

Orders

customer_id  | product_id
-------------|---------
1            | 1
1            | 2
2            | 3
2            | 4
3            | 5

Ratings

customer_id  | product_id | rate
-------------|------------|-------
1            | 1          | 3
2            | 2          | 2
2            | 4          | 3
4            | 2          | 4

2 answers:

Answer 0 (score: 1)

If you want to do this for a single customer, just use order by and limit:

-- customers and rating are assumed table names; column names follow the sample tables above
select c.*, r.*
from customers c cross join
     (select r.product_id, avg(r.rate) as avgr
      from rating r
      group by r.product_id
     ) r left join
     orders o
     on o.customer_id = c.id and
        o.product_id = r.product_id
where c.id = @customerid and o.product_id is null  -- keep only products with no matching order
order by r.avgr desc
limit 1;
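To try the single-customer approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the table names (customer, orders, rate) are assumptions borrowed from the second answer. The helper best_unpurchased is hypothetical, and the extra sort key on product_id is my addition so that ties resolve deterministically.

```python
import sqlite3

# Sample data from the question; table names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE orders   (customer_id INTEGER, product_id INTEGER);
CREATE TABLE rate     (customer_id INTEGER, product_id INTEGER, rate INTEGER);
INSERT INTO customer VALUES (1,'john'),(2,'jenkins'),(3,'jane'),(4,'janet');
INSERT INTO orders   VALUES (1,1),(1,2),(2,3),(2,4),(3,5);
INSERT INTO rate     VALUES (1,1,3),(2,2,2),(2,4,3),(4,2,4);
""")

def best_unpurchased(conn, customer_id):
    """Return (product_id, avg_rating) for the top-rated product the
    customer has not ordered, or None if there is no such product."""
    return conn.execute("""
        SELECT r.product_id, r.avgr
        FROM customer c
        CROSS JOIN (SELECT product_id, AVG(rate) AS avgr
                    FROM rate
                    GROUP BY product_id) r
        LEFT JOIN orders o
               ON o.customer_id = c.id AND o.product_id = r.product_id
        WHERE c.id = ? AND o.product_id IS NULL
        ORDER BY r.avgr DESC, r.product_id
        LIMIT 1
    """, (customer_id,)).fetchone()

print(best_unpurchased(conn, 1))  # (4, 3.0)
```

Only rated products appear in the derived table, so unrated products such as 3 and 5 never become candidates.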

If you want to do this for all customers at once, it is a bit more complicated. One approach is the substring_index()/group_concat() trick:

-- customers and rating are assumed table names; column names follow the sample tables above
select c.*,
       substring_index(group_concat(r.product_id order by r.avgr desc), ',', 1) as product_id
from customers c cross join
     (select r.product_id, avg(r.rate) as avgr
      from rating r
      group by r.product_id
     ) r left join
     orders o
     on o.customer_id = c.id and
        o.product_id = r.product_id
where o.product_id is null  -- no customer filter here, so every customer is kept
group by c.id;
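The trick may be easier to follow outside SQL. The sketch below mimics it in plain Python: group_concat(... order by avgr desc) builds one comma-separated string per customer, sorted best first, and substring_index(..., ',', 1) keeps everything before the first comma. The rows and the helper substring_index_first are hypothetical and chosen only to make the ordering visible.

```python
from collections import defaultdict

def substring_index_first(text, delim=","):
    # Mirrors SUBSTRING_INDEX(text, delim, 1): the part before the first delimiter.
    return text.split(delim, 1)[0]

# Hypothetical (customer_id, product_id, avg_rating) rows for products
# each customer has not purchased.
rows = [(1, 4, 3.0), (2, 1, 3.0), (2, 2, 3.5), (3, 4, 2.0), (3, 1, 4.0)]

groups = defaultdict(list)
for customer_id, product_id, avgr in rows:
    groups[customer_id].append((avgr, product_id))

best = {}
for customer_id, items in groups.items():
    # Equivalent of GROUP_CONCAT(product_id ORDER BY avgr DESC)
    ranked = sorted(items, key=lambda item: -item[0])
    concatenated = ",".join(str(product_id) for _, product_id in ranked)
    best[customer_id] = substring_index_first(concatenated)

print(best)  # {1: '4', 2: '2', 3: '1'}
```

As in MySQL, the result is a string rather than an integer. In MySQL the concatenated string is also capped by group_concat_max_len (1024 bytes by default), which does not affect the first element but is worth knowing if you keep more than one.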

Answer 1 (score: 1)

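I started by writing a few subqueries and then pieced them together. My personal advice is to always break a problem into smaller pieces before attempting the whole solution.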

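For example, one thing I needed to know was every product that each customer has not purchased. I got this by cross joining the customer and product tables (to get all pairings) and removing the pairs that already exist in the orders table, like this: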

-- Get all customer/product pairings where customer_product
-- does not exist in orders table
SELECT c.id, p.id
FROM customer c
CROSS JOIN product p
WHERE (c.id, p.id) NOT IN (SELECT * FROM orders)
ORDER BY c.id;
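Note that the row-value NOT IN with SELECT * relies on orders having exactly two columns, in the order (customer_id, product_id), and NOT IN behaves unexpectedly if either column can be NULL. As an alternative sketch (not the query used in this answer), the same anti-join can be written with NOT EXISTS. The example runs it through Python's sqlite3 module, with SQLite standing in for MySQL.

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE product  (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE orders   (customer_id INTEGER, product_id INTEGER);
INSERT INTO customer VALUES (1,'john'),(2,'jenkins'),(3,'jane'),(4,'janet');
INSERT INTO product  VALUES (1,'deodorant'),(2,'soap'),(3,'shampoo'),(4,'razor'),(5,'sponge');
INSERT INTO orders   VALUES (1,1),(1,2),(2,3),(2,4),(3,5);
""")

# Every customer/product pairing that has no matching row in orders.
pairs = conn.execute("""
    SELECT c.id, p.id
    FROM customer c
    CROSS JOIN product p
    WHERE NOT EXISTS (
        SELECT 1
        FROM orders o
        WHERE o.customer_id = c.id AND o.product_id = p.id)
    ORDER BY c.id, p.id
""").fetchall()

print(len(pairs))  # 15 (4 customers x 5 products, minus 5 orders)
print(pairs[:3])   # [(1, 3), (1, 4), (1, 5)]
```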

I also wrote a subquery to get the average rating of each product. If a product has no ratings, this query returns null for it:

SELECT p.id, AVG(r.rate) AS averageRating
FROM product p
LEFT JOIN rate r ON r.product_id = p.id
GROUP BY p.id;
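To see the NULL behaviour concretely, here is a small sketch that runs the same query on the sample data through Python's sqlite3 module (SQLite stands in for MySQL, and an ORDER BY is added only to make the output order predictable). Unrated products come back with None, which is how sqlite3 represents NULL.

```python
import sqlite3

# Sample products and ratings from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE rate    (customer_id INTEGER, product_id INTEGER, rate INTEGER);
INSERT INTO product VALUES (1,'deodorant'),(2,'soap'),(3,'shampoo'),(4,'razor'),(5,'sponge');
INSERT INTO rate    VALUES (1,1,3),(2,2,2),(2,4,3),(4,2,4);
""")

# LEFT JOIN keeps products without ratings; AVG over no rows yields NULL.
averages = conn.execute("""
    SELECT p.id, AVG(r.rate) AS averageRating
    FROM product p
    LEFT JOIN rate r ON r.product_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(averages)
# [(1, 3.0), (2, 3.0), (3, None), (4, 3.0), (5, None)]
```

Because NULL never compares equal to anything, products 3 and 5 drop out on their own once the averages are matched against a per-customer maximum.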

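Now I can use both of these as subqueries and select the customer ID, the product ID, and the average rating for every product each customer has not yet purchased: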

SELECT t1.customerID, t1.productID, t2.averageRating
FROM(
  SELECT c.id AS customerID, p.id AS productID
  FROM customer c
  CROSS JOIN product p
  WHERE (c.id, p.id) NOT IN (SELECT * FROM orders)
  ORDER BY c.id) t1
JOIN(
  SELECT p.id AS productID, AVG(r.rate) AS averageRating
  FROM product p
  LEFT JOIN rate r ON r.product_id = p.id
  GROUP BY p.id) t2 ON t2.productID = t1.productID;

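That takes care of the hardest part. The only thing left is to use an aggregate to get, for each customer, the maximum average rating among the products they have not purchased, and then join that aggregate query to the query above on the condition that the maximum rating equals the average rating. So, here is the monster query I put together: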

SELECT t1.customerID, t1.productID, t1.averageRating
FROM(
  SELECT t1.customerID, t1.productID, t2.averageRating
  FROM(
    SELECT c.id AS customerID, p.id AS productID
    FROM customer c
    CROSS JOIN product p
    WHERE (c.id, p.id) NOT IN (SELECT * FROM orders)
    ORDER BY c.id) t1
  JOIN(
    SELECT p.id AS productID, AVG(r.rate) AS averageRating
    FROM product p
    LEFT JOIN rate r ON r.product_id = p.id
    GROUP BY p.id) t2 ON t2.productID = t1.productID) t1
JOIN(
  SELECT t1.customerID, MAX(t2.averageRating) AS maxRating
  FROM(
    SELECT c.id AS customerID, p.id AS productID
    FROM customer c
    CROSS JOIN product p
    WHERE (c.id, p.id) NOT IN (SELECT * FROM orders)
    ORDER BY c.id) t1
  JOIN(
    SELECT p.id AS productID, AVG(r.rate) AS averageRating
    FROM product p
    LEFT JOIN rate r ON r.product_id = p.id
    GROUP BY p.id) t2 ON t2.productID = t1.productID
  GROUP BY t1.customerID) t2 ON t2.customerID = t1.customerID AND t2.maxRating = t1.averageRating
ORDER BY t1.customerID;

Here is a snapshot of the results from MySQL Workbench: [screenshot]

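One thing to note is that I did not eliminate ties. For example, customer 2 has not purchased product 1 or product 2, and both have the same average rating, so two rows are returned.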
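If exactly one row per customer is wanted, one option is to break ties explicitly. The sketch below is not the query from this answer: it uses a correlated subquery with ORDER BY ... LIMIT 1 and picks the lowest product ID among tied products, which is an arbitrary rule chosen for illustration. It runs on the sample data through Python's sqlite3 module, with SQLite standing in for MySQL.

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE orders   (customer_id INTEGER, product_id INTEGER);
CREATE TABLE rate     (customer_id INTEGER, product_id INTEGER, rate INTEGER);
INSERT INTO customer VALUES (1,'john'),(2,'jenkins'),(3,'jane'),(4,'janet');
INSERT INTO orders   VALUES (1,1),(1,2),(2,3),(2,4),(3,5);
INSERT INTO rate     VALUES (1,1,3),(2,2,2),(2,4,3),(4,2,4);
""")

# For each customer: the rated, unpurchased product with the highest
# average rating; ties go to the lowest product id.
top_picks = conn.execute("""
    SELECT c.id,
           (SELECT r.product_id
            FROM rate r
            WHERE NOT EXISTS (
                SELECT 1
                FROM orders o
                WHERE o.customer_id = c.id
                  AND o.product_id = r.product_id)
            GROUP BY r.product_id
            ORDER BY AVG(r.rate) DESC, r.product_id ASC
            LIMIT 1) AS product_id
    FROM customer c
    ORDER BY c.id
""").fetchall()

print(top_picks)  # [(1, 4), (2, 1), (3, 1), (4, 1)]
```

A customer with no rated, unpurchased product would get NULL in the product_id column rather than being dropped from the result.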

I tested this in MySQL because SQL Fiddle was not working at the time, but I have since got it working, so here is a Fiddle example if you would like one.