MySQL: Join a keywords table, but bring back only the most popular keyword

Time: 2010-10-14 22:32:15

Tags: mysql

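I have a "products" table. I also have a table of keywords that users have tagged those products with. I want to bring back the top keyword for each product, based on how many times each keyword was applied to that product.

The keywords table basically consists of the keyword, a primary key, and a foreign key linking it to the products table.

I think I have to join the keywords table (as shown below), but I don't know how to order by most popular.

Here is the SQL I already have. It currently brings back an arbitrary keyword for each product rather than the top one.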

SELECT product_name,keyword_keyword 
FROM products 
LEFT JOIN keywords ON keyword_pid = product_id
GROUP BY product_id

2 answers:

Answer 0: (score: 1)

I know this can be done in different ways, and probably more efficiently, but this is how my mind breaks it down:

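-- Count how often each keyword was applied to each product.
-- (MySQL cannot SELECT ... INTO a new #temp table, so the table is created explicitly.)
create temporary table temp1 as
select a.product_name, b.keyword_keyword, count(*) as keyword_count
from products a
join keywords b on a.product_id = b.keyword_pid
group by a.product_name, b.keyword_keyword;

-- MySQL cannot refer to the same temporary table twice in one query,
-- so the per-product maximum goes into a second temporary table.
create temporary table temp2 as
select product_name, max(keyword_count) as max_count
from temp1
group by product_name;

select x.product_name, x.keyword_keyword
from temp1 x
join temp2 m on m.product_name = x.product_name
            and m.max_count = x.keyword_count;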
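
As a rough sketch, the count-then-keep-the-maximum idea can be tried out with Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the sample rows are hypothetical: they are made up so that the per-product counts match the tables shown in the second answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE keywords (
    keyword_id INTEGER PRIMARY KEY,
    keyword_pid INTEGER REFERENCES products(product_id),
    keyword_keyword TEXT
);
INSERT INTO products VALUES (1, 'widget'), (2, 'screw'), (3, 'nail');
""")

# Hypothetical tag rows: one row per (product, keyword) tagging event.
tags = (
    [(1, "red")] * 3 + [(1, "widgety")] * 3
    + [(2, "curve")] + [(2, "red")] * 2 + [(2, "screwy")] * 3
    + [(3, "red")] + [(3, "spike")] * 2
)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags
)

# Step 1: materialize the per-product keyword counts.
conn.executescript("""
CREATE TEMP TABLE temp1 AS
SELECT a.product_name, b.keyword_keyword, COUNT(*) AS keyword_count
FROM products a
JOIN keywords b ON a.product_id = b.keyword_pid
GROUP BY a.product_name, b.keyword_keyword;
""")

# Step 2: keep only the rows whose count equals the maximum for that product.
top_keywords = conn.execute("""
SELECT x.product_name, x.keyword_keyword
FROM temp1 x
WHERE x.keyword_count = (SELECT MAX(t.keyword_count)
                         FROM temp1 t
                         WHERE t.product_name = x.product_name)
ORDER BY x.product_name, x.keyword_keyword
""").fetchall()

for row in top_keywords:
    print(row)
```

Ties are kept, so a product whose two best keywords have the same count appears twice in the result.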

Answer 1: (score: 0)

Here is the progression of SQL I used to arrive at my proposed solution (along with sample results):

First, the keyword counts:

SELECT k.*,
       COUNT(k.keyword_keyword)
FROM   keywords k
GROUP  BY k.keyword_pid,
          k.keyword_keyword  

+------------+-------------+-----------------+--------------------------+
| keyword_id | keyword_pid | keyword_keyword | count(k.keyword_keyword) |
+------------+-------------+-----------------+--------------------------+
|          3 |           1 | red             |                        3 | 
|          1 |           1 | widgety         |                        3 | 
|          9 |           2 | curve           |                        1 | 
|         10 |           2 | red             |                        2 | 
|          6 |           2 | screwy          |                        3 | 
|         12 |           3 | red             |                        1 | 
|          7 |           3 | spike           |                        2 | 
+------------+-------------+-----------------+--------------------------+
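
The counting step can be reproduced as a small sketch with Python's sqlite3 module (SQLite standing in for MySQL, with hypothetical rows made up to match the counts above). One deliberate difference: the sketch selects only the grouped columns plus the count, because a non-grouped column such as keyword_id is picked arbitrarily from each group when it is selected through k.*.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keywords ("
    "keyword_id INTEGER PRIMARY KEY, keyword_pid INTEGER, keyword_keyword TEXT)"
)

# Hypothetical tag rows, chosen so the counts match the table above.
tags = (
    [(1, "red")] * 3 + [(1, "widgety")] * 3
    + [(2, "curve")] + [(2, "red")] * 2 + [(2, "screwy")] * 3
    + [(3, "red")] + [(3, "spike")] * 2
)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags
)

# One output row per (product, keyword) pair with the number of taggings.
counts = conn.execute("""
SELECT k.keyword_pid, k.keyword_keyword, COUNT(*) AS cnt
FROM keywords k
GROUP BY k.keyword_pid, k.keyword_keyword
ORDER BY k.keyword_pid, k.keyword_keyword
""").fetchall()

for row in counts:
    print(row)
```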

For each keyword_pid, we need to find the (keyword_pid, keyword_keyword) pair with the highest count. There is a tried and true idiom for this:

SELECT t1.*,
       t2.*
FROM   (SELECT k.*,
               COUNT(k.keyword_keyword) cnt
        FROM   keywords k
        GROUP  BY k.keyword_pid,
                  k.keyword_keyword) t1
       LEFT JOIN (SELECT k.*,
                         COUNT(k.keyword_keyword) cnt
                  FROM   keywords k
                  GROUP  BY k.keyword_pid,
                            k.keyword_keyword) t2
         ON t1.keyword_pid = t2.keyword_pid
            AND t1.cnt < t2.cnt  

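Note that I have repeated the same SELECT twice above. I am assuming MySQL caches the result of the first SELECT, so the second one should be very fast. If I am wrong, I hope someone will disabuse me of that belief.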

+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
| keyword_id | keyword_pid | keyword_keyword | cnt | keyword_id | keyword_pid | keyword_keyword | cnt  |
+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
|          3 |           1 | red             |   3 |       NULL |        NULL | NULL            | NULL | 
|          1 |           1 | widgety         |   3 |       NULL |        NULL | NULL            | NULL | 
|          9 |           2 | curve           |   1 |         10 |           2 | red             |    2 | 
|          9 |           2 | curve           |   1 |          6 |           2 | screwy          |    3 | 
|         10 |           2 | red             |   2 |          6 |           2 | screwy          |    3 | 
|          6 |           2 | screwy          |   3 |       NULL |        NULL | NULL            | NULL | 
|         12 |           3 | red             |   1 |          7 |           3 | spike           |    2 | 
|          7 |           3 | spike           |   2 |       NULL |        NULL | NULL            | NULL | 
+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
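
To make the join condition concrete, here is a plain-Python sketch of what the LEFT JOIN does. The rows are copied from the count table earlier in this answer; the function name left_join_on_larger_count is made up for illustration.

```python
# (keyword_pid, keyword_keyword, cnt) rows copied from the count table above.
counts = [
    (1, "red", 3), (1, "widgety", 3),
    (2, "curve", 1), (2, "red", 2), (2, "screwy", 3),
    (3, "red", 1), (3, "spike", 2),
]


def left_join_on_larger_count(rows):
    """Pair each row with every row of the same product that has a strictly
    larger count. A row with no such partner is paired with None, which is
    what the NULL columns of a LEFT JOIN represent."""
    joined = []
    for t1 in rows:
        partners = [t2 for t2 in rows if t2[0] == t1[0] and t1[2] < t2[2]]
        if partners:
            joined.extend((t1, t2) for t2 in partners)
        else:
            joined.append((t1, None))
    return joined


joined = left_join_on_larger_count(counts)

# Rows that nothing beats: these are the per-product maxima.
unmatched = [t1 for t1, t2 in joined if t2 is None]
print(len(joined), unmatched)
```

A row ends up without a partner exactly when no other keyword for the same product has a higher count, so tied maxima all survive.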

The rows where t2.cnt IS NULL are the ones holding the maximum count for each keyword_pid (this is part of the idiom for finding the maximum):

SELECT t1.*
FROM   (SELECT k.*,
               COUNT(k.keyword_keyword) cnt
        FROM   keywords k
        GROUP  BY k.keyword_pid,
                  k.keyword_keyword) t1
       LEFT JOIN (SELECT k.*,
                         COUNT(k.keyword_keyword) cnt
                  FROM   keywords k
                  GROUP  BY k.keyword_pid,
                            k.keyword_keyword) t2
         ON t1.keyword_pid = t2.keyword_pid
            AND t1.cnt < t2.cnt
WHERE  t2.cnt IS NULL  

+------------+-------------+-----------------+-----+
| keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+-------------+-----------------+-----+
|          3 |           1 | red             |   3 | 
|          1 |           1 | widgety         |   3 | 
|          6 |           2 | screwy          |   3 | 
|          7 |           3 | spike           |   2 | 
+------------+-------------+-----------------+-----+
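
One way to avoid writing the grouped SELECT twice is a common table expression. This is only a sketch and not part of the original answer: it assumes a database with CTE support (MySQL gained it in 8.0, well after this answer was written), uses SQLite through Python's sqlite3 module as a stand-in, and relies on hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keywords ("
    "keyword_id INTEGER PRIMARY KEY, keyword_pid INTEGER, keyword_keyword TEXT)"
)

# Hypothetical tag rows, chosen so the counts match the tables above.
tags = (
    [(1, "red")] * 3 + [(1, "widgety")] * 3
    + [(2, "curve")] + [(2, "red")] * 2 + [(2, "screwy")] * 3
    + [(3, "red")] + [(3, "spike")] * 2
)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags
)

# The grouped counts are written once and referenced twice in the self-join.
top = conn.execute("""
WITH counts AS (
    SELECT keyword_pid, keyword_keyword, COUNT(*) AS cnt
    FROM keywords
    GROUP BY keyword_pid, keyword_keyword
)
SELECT t1.keyword_pid, t1.keyword_keyword, t1.cnt
FROM counts t1
LEFT JOIN counts t2
  ON t1.keyword_pid = t2.keyword_pid
 AND t1.cnt < t2.cnt
WHERE t2.cnt IS NULL
ORDER BY t1.keyword_pid, t1.keyword_keyword
""").fetchall()

for row in top:
    print(row)
```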

The rest is relatively easy. First, we join the products table so that we can see which product goes with which keyword:

SELECT p.*,
       t1.*
FROM   (SELECT k.*,
               COUNT(k.keyword_keyword) cnt
        FROM   keywords k
        GROUP  BY k.keyword_pid,
                  k.keyword_keyword) t1
       LEFT JOIN (SELECT k.*,
                         COUNT(k.keyword_keyword) cnt
                  FROM   keywords k
                  GROUP  BY k.keyword_pid,
                            k.keyword_keyword) t2
         ON t1.keyword_pid = t2.keyword_pid
            AND t1.cnt < t2.cnt
       LEFT JOIN products p
         ON p.product_id = t1.keyword_pid
WHERE  t2.cnt IS NULL  

+------------+--------------+------------+-------------+-----------------+-----+
| product_id | product_name | keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+--------------+------------+-------------+-----------------+-----+
|          1 | widget       |          3 |           1 | red             |   3 | 
|          1 | widget       |          1 |           1 | widgety         |   3 | 
|          2 | screw        |          6 |           2 | screwy          |   3 | 
|          3 | nail         |          7 |           3 | spike           |   2 | 
+------------+--------------+------------+-------------+-----------------+-----+

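If you want to keep ties between keywords, the above is the solution. If you want to get rid of ties (arbitrarily), you can add another GROUP BY. Note that this relies on MySQL's permissive GROUP BY handling: the query is rejected when ONLY_FULL_GROUP_BY is enabled, and otherwise which of the tied rows is kept for each product is not guaranteed: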

SELECT p.*,
       t1.*
FROM   (SELECT k.*,
               COUNT(k.keyword_keyword) cnt
        FROM   keywords k
        GROUP  BY k.keyword_pid,
                  k.keyword_keyword) t1
       LEFT JOIN (SELECT k.*,
                         COUNT(k.keyword_keyword) cnt
                  FROM   keywords k
                  GROUP  BY k.keyword_pid,
                            k.keyword_keyword) t2
         ON t1.keyword_pid = t2.keyword_pid
            AND t1.cnt < t2.cnt
       LEFT JOIN products p
         ON p.product_id = t1.keyword_pid
WHERE  t2.cnt IS NULL
GROUP  BY p.product_id  

+------------+--------------+------------+-------------+-----------------+-----+
| product_id | product_name | keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+--------------+------------+-------------+-----------------+-----+
|          1 | widget       |          3 |           1 | red             |   3 | 
|          2 | screw        |          6 |           2 | screwy          |   3 | 
|          3 | nail         |          7 |           3 | spike           |   2 | 
+------------+--------------+------------+-------------+-----------------+-----+
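
If an arbitrary pick is not acceptable, ties can be broken deterministically instead. The sketch below is an assumption of mine rather than part of the answer: it keeps the alphabetically first keyword among the tied ones by using MIN(), runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, and uses hypothetical sample rows and a made-up CTE name, winners.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE keywords (
    keyword_id INTEGER PRIMARY KEY,
    keyword_pid INTEGER REFERENCES products(product_id),
    keyword_keyword TEXT
);
INSERT INTO products VALUES (1, 'widget'), (2, 'screw'), (3, 'nail');
""")

# Hypothetical tag rows, chosen so the counts match the tables above.
tags = (
    [(1, "red")] * 3 + [(1, "widgety")] * 3
    + [(2, "curve")] + [(2, "red")] * 2 + [(2, "screwy")] * 3
    + [(3, "red")] + [(3, "spike")] * 2
)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags
)

one_per_product = conn.execute("""
WITH counts AS (
    SELECT keyword_pid, keyword_keyword, COUNT(*) AS cnt
    FROM keywords
    GROUP BY keyword_pid, keyword_keyword
),
winners AS (
    -- Per-product maxima, ties included (the LEFT JOIN ... IS NULL idiom).
    SELECT t1.keyword_pid, t1.keyword_keyword, t1.cnt
    FROM counts t1
    LEFT JOIN counts t2
      ON t1.keyword_pid = t2.keyword_pid
     AND t1.cnt < t2.cnt
    WHERE t2.cnt IS NULL
)
SELECT p.product_id,
       p.product_name,
       MIN(w.keyword_keyword) AS keyword_keyword,  -- deterministic tie-break
       MAX(w.cnt) AS cnt                           -- all tied rows share this count
FROM winners w
JOIN products p ON p.product_id = w.keyword_pid
GROUP BY p.product_id, p.product_name
ORDER BY p.product_id
""").fetchall()

for row in one_per_product:
    print(row)
```

Because every selected column is either grouped or aggregated, this form does not depend on the permissive GROUP BY behavior.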