How can I query for recommended books, the way Amazon does?

Time: 2012-03-17 18:38:53

Tags: mysql sql database

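I am developing a web portal for a bookstore, and I want to be able to recommend books to users.

I want something similar to amazon.com: when a user orders book A, the system should offer a list of other suggested books. If there is a user Bob who bought both A and B, then book B should be suggested. In addition, I want the system to return the suggested books sorted by decreasing sales count, counting only sales to users who bought both books (like Bob).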

Here are the relevant tables:

Book (ISBN, title, publicationYear, etc.)

Orders (orderID, loginName, date)

BooksOrdered (orderID, ISBN, count)

This query is more complicated than anything I have attempted before.

Current idea:

First, find all users who ordered the same book (ISBN):

  1. Join all three tables on Book.ISBN = BooksOrdered.ISBN and Orders.orderID = BooksOrdered.orderID
  2. WHERE Book.ISBN = bookInQuestionISBN
  3. GROUP BY Orders.loginName
  4. Project out loginName

Something like:

SELECT Orders.loginName AS otherBuyerLoginName
FROM Book, Orders, BooksOrdered
WHERE Book.ISBN = bookInQuestionISBN
  AND Book.ISBN = BooksOrdered.ISBN
  AND Orders.orderID = BooksOrdered.orderID
GROUP BY Orders.loginName

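Then I can grab all the books ordered by those loginNames, group them by loginName, total them up, and ORDER BY SUM(BooksOrdered.count) DESC.

However, I think the top result would most likely be the book in question itself. I don't want to suggest the book the user just bought.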
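A minimal sketch of that two-step plan, using Python's built-in sqlite3 module as a stand-in for MySQL. The rows, the login names alice and bob, and the helper suggest() are made up for illustration; only the table and column names come from the schema above. The sketch assumes the totals should be grouped by ISBN so that each book gets one total, and it keeps the purchased book out of the list with a simple `ISBN <> ?` filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, loginName TEXT);
    CREATE TABLE BooksOrdered (orderID INTEGER, ISBN TEXT, count INTEGER);
    -- Hypothetical rows, for illustration only.
    INSERT INTO Orders VALUES (1, 'alice'), (2, 'bob'), (3, 'bob');
    INSERT INTO BooksOrdered VALUES
        (1, 'A', 1), (1, 'B', 2), (2, 'A', 5), (3, 'C', 4);
""")


def suggest(conn, isbn):
    # Step 1: every login that has ordered the book in question.
    buyers = [row[0] for row in conn.execute(
        """SELECT DISTINCT o.loginName
           FROM Orders o
           JOIN BooksOrdered bo ON bo.orderID = o.orderID
           WHERE bo.ISBN = ?""", (isbn,))]
    if not buyers:
        return []
    # Step 2: total copies per book across all orders by those logins,
    # leaving out the book in question itself.
    placeholders = ",".join("?" * len(buyers))
    return conn.execute(
        f"""SELECT bo.ISBN, SUM(bo.count) AS total
            FROM Orders o
            JOIN BooksOrdered bo ON bo.orderID = o.orderID
            WHERE o.loginName IN ({placeholders}) AND bo.ISBN <> ?
            GROUP BY bo.ISBN
            ORDER BY total DESC""", (*buyers, isbn)).fetchall()


suggestions = suggest(conn, "A")  # [('C', 4), ('B', 2)]
```

Both alice and bob bought A, so their other purchases (B and C) are ranked, and A itself never appears.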

What do you suggest? Maybe I should start over?

Edit:

Here is some data:

BooksOrdered contains:

orderID ISBN        count
    3   FakeISBN    3
    7   FakeISBN    3
    8   FakeISBN    100
    11  FakeISBN2   40
    7   FakeISBN2   4
    10  FakeISBN2   20
    10  FakeISBN3   34
    11  TesterISBN  3
    9   TesterISBN  1

Orders contains:

orderID loginName  date
2       Tester     2012-03-15 19:43:27
3       Tester     2012-03-16 15:56:55
6       Tester2    2012-03-16 17:28:02
7       Tester     2012-03-16 17:31:21
8       ni3hao3    2012-03-16 23:18:15
9       ni3hao3    2012-03-17 13:12:38
10      ni3hao3    2012-03-17 13:13:55
11      Bobby      2012-03-17 13:28:14

OK, now I want the best suggestions for the book with ISBN = "TesterISBN".

Two people ordered "TesterISBN": ni3hao3 and Bobby.

ni3hao3's total purchase history:

1 copy of "TesterISBN"
100 copies of "FakeISBN"
20 copies of "FakeISBN2"
34 copies of "FakeISBN3"

Bobby's total purchase history:

3 copies of "TesterISBN"
40 copies of "FakeISBN2"

So the total sales to buyers of "TesterISBN" are:

4 copies of "TesterISBN"
100 copies of "FakeISBN"
60 copies of "FakeISBN2"
34 copies of "FakeISBN3"

So I want the result to be:

FakeISBN
FakeISBN2
FakeISBN3

in that order.
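The arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the bookkeeping, not a proposed implementation: the rows are copied from the sample data, and the function name suggestions_for is made up for illustration.

```python
from collections import Counter

# Rows copied from the sample data above.
orders = {2: "Tester", 3: "Tester", 6: "Tester2", 7: "Tester",
          8: "ni3hao3", 9: "ni3hao3", 10: "ni3hao3", 11: "Bobby"}
books_ordered = [
    (3, "FakeISBN", 3), (7, "FakeISBN", 3), (8, "FakeISBN", 100),
    (11, "FakeISBN2", 40), (7, "FakeISBN2", 4), (10, "FakeISBN2", 20),
    (10, "FakeISBN3", 34), (11, "TesterISBN", 3), (9, "TesterISBN", 1),
]


def suggestions_for(isbn):
    # Logins with at least one order that contains the book.
    buyers = {orders[oid] for oid, book, _ in books_ordered if book == isbn}
    # Sum the copies of every other book across all orders by those logins.
    totals = Counter()
    for oid, book, copies in books_ordered:
        if orders[oid] in buyers and book != isbn:
            totals[book] += copies
    # most_common() returns (book, total) pairs, largest total first.
    return totals.most_common()


result = suggestions_for("TesterISBN")
# [('FakeISBN', 100), ('FakeISBN2', 60), ('FakeISBN3', 34)]
```

Orders 3 and 7 (login Tester) contribute nothing, because Tester never ordered "TesterISBN".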

Edit:

I believe I have figured it out:

SELECT Bo.ISBN, B.title, SUM(Bo.count)
FROM BooksOrdered Bo, Orders O, Book B
WHERE Bo.orderID = O.orderID AND Bo.ISBN = B.ISBN
  AND Bo.ISBN != 'TesterISBN'
  AND O.loginName IN ( SELECT DISTINCT Orders.loginName
                       FROM Orders, BooksOrdered
                       WHERE BooksOrdered.ISBN = 'TesterISBN'
                         AND Orders.orderID = BooksOrdered.orderID )
GROUP BY Bo.ISBN, B.title
ORDER BY SUM(Bo.count) DESC
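To check this against the sample data, here is a small harness using Python's sqlite3 module in place of MySQL. It is a sketch under a few assumptions: the query is rewritten with explicit JOINs and a bound parameter `:isbn` instead of the literal, the book titles are placeholders (no Book rows were listed), and the helper name recommend is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (ISBN TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, loginName TEXT, date TEXT);
    CREATE TABLE BooksOrdered (orderID INTEGER, ISBN TEXT, count INTEGER);

    -- Placeholder titles: no Book rows were given in the question.
    INSERT INTO Book VALUES
        ('FakeISBN', 'Title 1'), ('FakeISBN2', 'Title 2'),
        ('FakeISBN3', 'Title 3'), ('TesterISBN', 'Title T');

    INSERT INTO Orders VALUES
        (2, 'Tester', '2012-03-15 19:43:27'),
        (3, 'Tester', '2012-03-16 15:56:55'),
        (6, 'Tester2', '2012-03-16 17:28:02'),
        (7, 'Tester', '2012-03-16 17:31:21'),
        (8, 'ni3hao3', '2012-03-16 23:18:15'),
        (9, 'ni3hao3', '2012-03-17 13:12:38'),
        (10, 'ni3hao3', '2012-03-17 13:13:55'),
        (11, 'Bobby', '2012-03-17 13:28:14');

    INSERT INTO BooksOrdered VALUES
        (3, 'FakeISBN', 3), (7, 'FakeISBN', 3), (8, 'FakeISBN', 100),
        (11, 'FakeISBN2', 40), (7, 'FakeISBN2', 4), (10, 'FakeISBN2', 20),
        (10, 'FakeISBN3', 34), (11, 'TesterISBN', 3), (9, 'TesterISBN', 1);
""")

RECOMMEND_SQL = """
SELECT Bo.ISBN, B.title, SUM(Bo.count) AS copies
FROM BooksOrdered Bo
JOIN Orders O ON Bo.orderID = O.orderID
JOIN Book B ON Bo.ISBN = B.ISBN
WHERE Bo.ISBN != :isbn
  AND O.loginName IN (SELECT Orders.loginName
                      FROM Orders
                      JOIN BooksOrdered ON Orders.orderID = BooksOrdered.orderID
                      WHERE BooksOrdered.ISBN = :isbn)
GROUP BY Bo.ISBN, B.title
ORDER BY copies DESC
"""


def recommend(conn, isbn):
    # The same named parameter is bound in both places it appears.
    return conn.execute(RECOMMEND_SQL, {"isbn": isbn}).fetchall()


rows = recommend(conn, "TesterISBN")
# [('FakeISBN', 'Title 1', 100), ('FakeISBN2', 'Title 2', 60),
#  ('FakeISBN3', 'Title 3', 34)]
```

The result matches the expected list from the worked example: FakeISBN, FakeISBN2, FakeISBN3.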

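3 Answers:

Answer 0: (score: 0)

As I see it, you can solve this in two steps:

  1. Find which users bought the book in question and get the IDs of all those orders
  2. Group all the other books in those orders and count the occurrences (i.e. the number of orders in which each one was bought together with the book in question)

It would translate into something like this: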

    SELECT bo.ISBN, COUNT(*) AS timesBoughtTogether
    FROM (SELECT DISTINCT o.orderID FROM Orders o
          INNER JOIN BooksOrdered bo ON o.orderID = bo.orderID
          WHERE bo.ISBN = 'ISBN to provide suggestions for') relevantOrders
    INNER JOIN BooksOrdered bo ON relevantOrders.orderID = bo.orderID
    WHERE bo.ISBN != 'ISBN to provide suggestions for'
    GROUP BY bo.ISBN
    ORDER BY timesBoughtTogether DESC
    

    I haven't actually run it, so I hope the syntax isn't off.
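For comparison, a sketch of this per-order approach run against the question's sample rows, again with Python's sqlite3 module standing in for MySQL. It is a variant rather than the query above verbatim: it is written as a self-join on BooksOrdered instead of a derived table, and the aliases target and other and the helper bought_together are made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BooksOrdered (orderID INTEGER, ISBN TEXT, count INTEGER)")
# Rows copied from the question's sample data.
conn.executemany("INSERT INTO BooksOrdered VALUES (?, ?, ?)", [
    (3, "FakeISBN", 3), (7, "FakeISBN", 3), (8, "FakeISBN", 100),
    (11, "FakeISBN2", 40), (7, "FakeISBN2", 4), (10, "FakeISBN2", 20),
    (10, "FakeISBN3", 34), (11, "TesterISBN", 3), (9, "TesterISBN", 1),
])

BOUGHT_TOGETHER_SQL = """
SELECT other.ISBN, COUNT(DISTINCT other.orderID) AS timesBoughtTogether
FROM BooksOrdered target
JOIN BooksOrdered other ON other.orderID = target.orderID
WHERE target.ISBN = ? AND other.ISBN <> ?
GROUP BY other.ISBN
ORDER BY timesBoughtTogether DESC
"""


def bought_together(conn, isbn):
    # 'target' rows pick the orders containing the book; 'other' rows are
    # the remaining books in those same orders.
    return conn.execute(BOUGHT_TOGETHER_SQL, (isbn, isbn)).fetchall()


rows = bought_together(conn, "TesterISBN")  # [('FakeISBN2', 1)]
```

On this data only FakeISBN2 comes back, because order 11 is the only order that contains "TesterISBN" alongside another book. Counting per order is narrower than the per-user totals the question asks for, which also include ni3hao3's separate orders 8 and 10.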

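Answer 1: (score: 0)

I wrote this on SQL Server, but converting it to MySQL syntax should be trivial.

This query returns recommendations based on the books in the order in question (@currentOrderID):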

select b.ISBN, b.title, sum(bo.count) as ranking
from Book b inner join 
    BooksOrdered bo on b.ISBN = bo.ISBN inner join
    (   select distinct orderID
        from BooksOrdered bo
        where bo.ISBN in (  select ISBN
                            from BooksOrdered bo1
                            where bo1.orderID = @currentOrderID )
    ) o on bo.orderID = o.orderID
where b.ISBN not in ( select ISBN
                      from BooksOrdered bo1
                      where bo1.orderID = @currentOrderID )
group by b.ISBN, b.title
order by sum(bo.count) desc

This one returns suggestions based on the entire order history of the login that placed the current order:

select b.ISBN, b.title, sum(bo.count) as ranking
from Book b inner join 
    BooksOrdered bo on b.ISBN = bo.ISBN inner join
    (   select distinct bo.orderID
        from BooksOrdered bo
        where bo.ISBN in (  select ISBN
                            from BooksOrdered bo1
                            where bo1.orderID = @currentOrderID )
    ) o on bo.orderID = o.orderID
where b.ISBN not in (   select ISBN
                        from Orders o inner join
                            BooksOrdered bo on o.orderID = bo.orderID
                        where o.loginName = (   select loginName
                                                from Orders
                                                where orderID = @currentOrderID ) )
group by b.ISBN, b.title
order by sum(bo.count) desc
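A sketch of how the two variants differ on the question's sample data, using Python's sqlite3 module rather than SQL Server. Several things here are assumptions: `:oid` stands in for @currentOrderID, the book titles are placeholders, the shared part of the two queries is factored into one template, and the names BASE, rel, bo2, o2 and recommend are made up. Both variants use a de-duplicated list of related orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (ISBN TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, loginName TEXT);
    CREATE TABLE BooksOrdered (orderID INTEGER, ISBN TEXT, count INTEGER);

    -- Placeholder titles: no Book rows were given in the question.
    INSERT INTO Book VALUES
        ('FakeISBN', 'Title 1'), ('FakeISBN2', 'Title 2'),
        ('FakeISBN3', 'Title 3'), ('TesterISBN', 'Title T');
    INSERT INTO Orders VALUES
        (2, 'Tester'), (3, 'Tester'), (6, 'Tester2'), (7, 'Tester'),
        (8, 'ni3hao3'), (9, 'ni3hao3'), (10, 'ni3hao3'), (11, 'Bobby');
    INSERT INTO BooksOrdered VALUES
        (3, 'FakeISBN', 3), (7, 'FakeISBN', 3), (8, 'FakeISBN', 100),
        (11, 'FakeISBN2', 40), (7, 'FakeISBN2', 4), (10, 'FakeISBN2', 20),
        (10, 'FakeISBN3', 34), (11, 'TesterISBN', 3), (9, 'TesterISBN', 1);
""")

# Shared part: rank books found in any order that has at least one book in
# common with the current order. Only the exclusion subquery differs.
BASE = """
SELECT b.ISBN, b.title, SUM(bo.count) AS ranking
FROM Book b
JOIN BooksOrdered bo ON b.ISBN = bo.ISBN
JOIN (SELECT DISTINCT orderID
      FROM BooksOrdered
      WHERE ISBN IN (SELECT ISBN FROM BooksOrdered WHERE orderID = :oid)
     ) rel ON bo.orderID = rel.orderID
WHERE b.ISBN NOT IN ({excluded})
GROUP BY b.ISBN, b.title
ORDER BY ranking DESC
"""

# Variant 1: exclude only the books in the current order.
IN_CURRENT_ORDER = "SELECT ISBN FROM BooksOrdered WHERE orderID = :oid"

# Variant 2: exclude every book the ordering login has ever bought.
IN_LOGIN_HISTORY = """
    SELECT bo2.ISBN
    FROM Orders o2
    JOIN BooksOrdered bo2 ON o2.orderID = bo2.orderID
    WHERE o2.loginName = (SELECT loginName FROM Orders WHERE orderID = :oid)
"""


def recommend(conn, order_id, exclude_history=False):
    # format() only splices in one of the two constant fragments above;
    # the order ID itself is always passed as a bound parameter.
    excluded = IN_LOGIN_HISTORY if exclude_history else IN_CURRENT_ORDER
    return conn.execute(BASE.format(excluded=excluded),
                        {"oid": order_id}).fetchall()


by_order = recommend(conn, 9)                          # [('FakeISBN2', 'Title 2', 40)]
by_history = recommend(conn, 9, exclude_history=True)  # []
```

Order 9 belongs to ni3hao3 and contains only "TesterISBN". The first variant suggests FakeISBN2 (from Bobby's order 11); the second returns nothing, because ni3hao3 already bought FakeISBN2 in order 10.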

Hope these give you what you need!

Answer 2: (score: 0)

SELECT Bo.ISBN, B.title, SUM(Bo.count)
FROM BooksOrdered Bo, Orders O, Book B
WHERE Bo.orderID = O.orderID AND Bo.ISBN = B.ISBN
  AND Bo.ISBN != 'TesterISBN'
  AND O.loginName IN ( SELECT DISTINCT Orders.loginName
                       FROM Orders, BooksOrdered
                       WHERE BooksOrdered.ISBN = 'TesterISBN'
                         AND Orders.orderID = BooksOrdered.orderID )
GROUP BY Bo.ISBN, B.title
ORDER BY SUM(Bo.count) DESC

This seems to do the trick.