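I have a Products table. I also have a table of keywords that users have tagged those products with. I want to retrieve the top keyword for each product, based on how many times each keyword has been tagged on that product.
The keywords table basically consists of the keyword, a primary key, and a foreign key linking it to the Products table.
I think I have to join the keywords table (as shown below), but I don't know how to order by the most popular keyword.
Here is the SQL I already have. It currently just brings back an arbitrary keyword rather than the top one.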
SELECT product_name,keyword_keyword
FROM products
LEFT JOIN keywords ON keyword_pid = product_id
GROUP BY product_id
Answer 0 (score: 1)
I know this can be done in different ways, and probably more efficiently, but this is how my mind breaks it down:
select a.product_name, b.keyword_keyword, count(*) as keyword_count
into #temp1
from products a
join keywords b on a.product_id = b.keyword_pid
group by a.product_name, b.keyword_keyword
select x.product_name, x.keyword_keyword
from #temp1 x
where x.keyword_count = (select MAX(keyword_count) from #temp1
where product_name = x.product_name)
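For anyone who wants to try the two-step approach above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the make_db helper, and the temp table name temp1 are assumptions made for illustration. SQLite has no "select ... into #temp1", so the sketch uses CREATE TEMP TABLE ... AS instead.

```python
import sqlite3


def make_db():
    """Build an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE keywords (
            keyword_id INTEGER PRIMARY KEY,
            keyword_pid INTEGER REFERENCES products (product_id),
            keyword_keyword TEXT
        );
    """)
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     [(1, "widget"), (2, "screw"), (3, "nail")])
    # One row per tagging event: (product id, keyword).
    tags = ([(1, "red")] * 3 + [(1, "widgety")] * 3 + [(2, "curve")]
            + [(2, "red")] * 2 + [(2, "screwy")] * 3 + [(3, "red")]
            + [(3, "spike")] * 2)
    conn.executemany(
        "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags)
    return conn


conn = make_db()

# Step 1: count how often each keyword was tagged on each product.
conn.execute("""
    CREATE TEMP TABLE temp1 AS
    SELECT a.product_name, b.keyword_keyword, COUNT(*) AS keyword_count
    FROM products a
    JOIN keywords b ON a.product_id = b.keyword_pid
    GROUP BY a.product_name, b.keyword_keyword
""")

# Step 2: keep the rows whose count equals the maximum for that product.
top_keywords = conn.execute("""
    SELECT x.product_name, x.keyword_keyword
    FROM temp1 x
    WHERE x.keyword_count = (SELECT MAX(keyword_count)
                             FROM temp1
                             WHERE product_name = x.product_name)
    ORDER BY x.product_name, x.keyword_keyword
""").fetchall()

for row in top_keywords:
    print(row)
```

Tied keywords all come back, so a product can appear more than once. Grouping by product_name also merges any two products that share a name; grouping by product_id would avoid that.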
Answer 1 (score: 0)
Here is the progression of SQL I used to arrive at my proposed solution (with sample results):
Here are the keyword counts:
SELECT k.*,
COUNT(k.keyword_keyword)
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword
+------------+-------------+-----------------+--------------------------+
| keyword_id | keyword_pid | keyword_keyword | count(k.keyword_keyword) |
+------------+-------------+-----------------+--------------------------+
| 3 | 1 | red | 3 |
| 1 | 1 | widgety | 3 |
| 9 | 2 | curve | 1 |
| 10 | 2 | red | 2 |
| 6 | 2 | screwy | 3 |
| 12 | 3 | red | 1 |
| 7 | 3 | spike | 2 |
+------------+-------------+-----------------+--------------------------+
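One caveat on the counting query above: it selects k.* while grouping on only two columns. MySQL accepts that only when ONLY_FULL_GROUP_BY is disabled, and keyword_id then comes from an arbitrary row in each group. Below is a small sketch of the same count that selects only the grouped columns. It runs on Python's sqlite3 rather than MySQL, and the sample rows are an assumption chosen to reproduce the counts shown in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE keywords (
        keyword_id INTEGER PRIMARY KEY,
        keyword_pid INTEGER,
        keyword_keyword TEXT
    )
""")
# Hypothetical tagging events: (product id, keyword).
tags = ([(1, "red")] * 3 + [(1, "widgety")] * 3 + [(2, "curve")]
        + [(2, "red")] * 2 + [(2, "screwy")] * 3 + [(3, "red")]
        + [(3, "spike")] * 2)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags)

# Every selected column is either grouped or aggregated, so the result
# does not depend on which row the engine picks from each group.
counts = conn.execute("""
    SELECT k.keyword_pid, k.keyword_keyword, COUNT(*) AS cnt
    FROM keywords k
    GROUP BY k.keyword_pid, k.keyword_keyword
    ORDER BY k.keyword_pid, k.keyword_keyword
""").fetchall()

for row in counts:
    print(row)
```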
We need to find the rows with the maximum count for each keyword_pid.
There's a tried and true idiom for this:
SELECT t1.*,
t2.*
FROM (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t1
LEFT JOIN (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t2
ON t1.keyword_pid = t2.keyword_pid
AND t1.cnt < t2.cnt
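Note that I repeated the same SELECT twice above. I'm assuming MySQL caches the result of the first SELECT, so the second should be very fast.
If I'm wrong, I hope someone will disabuse me of that belief.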
+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
| keyword_id | keyword_pid | keyword_keyword | cnt | keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
| 3 | 1 | red | 3 | NULL | NULL | NULL | NULL |
| 1 | 1 | widgety | 3 | NULL | NULL | NULL | NULL |
| 9 | 2 | curve | 1 | 10 | 2 | red | 2 |
| 9 | 2 | curve | 1 | 6 | 2 | screwy | 3 |
| 10 | 2 | red | 2 | 6 | 2 | screwy | 3 |
| 6 | 2 | screwy | 3 | NULL | NULL | NULL | NULL |
| 12 | 3 | red | 1 | 7 | 3 | spike | 2 |
| 7 | 3 | spike | 2 | NULL | NULL | NULL | NULL |
+------------+-------------+-----------------+-----+------------+-------------+-----------------+------+
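On the repeated subquery: rather than relying on caching, the counting query can be written once as a common table expression and joined to itself. This is only a sketch. It assumes an engine with CTE support (SQLite here through Python's sqlite3; MySQL added WITH in 8.0), it makes no claim about speed, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE keywords (
        keyword_id INTEGER PRIMARY KEY,
        keyword_pid INTEGER,
        keyword_keyword TEXT
    )
""")
# Hypothetical tagging events: (product id, keyword).
tags = ([(1, "red")] * 3 + [(1, "widgety")] * 3 + [(2, "curve")]
        + [(2, "red")] * 2 + [(2, "screwy")] * 3 + [(3, "red")]
        + [(3, "spike")] * 2)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags)

# The counting query is named once (counts) and referenced twice.
# Each t1 row is paired with every t2 row for the same product that has
# a strictly larger count; rows with no such partner get NULLs for t2.
pairs = conn.execute("""
    WITH counts AS (
        SELECT keyword_pid, keyword_keyword, COUNT(*) AS cnt
        FROM keywords
        GROUP BY keyword_pid, keyword_keyword
    )
    SELECT t1.keyword_pid, t1.keyword_keyword, t1.cnt,
           t2.keyword_keyword, t2.cnt
    FROM counts t1
    LEFT JOIN counts t2
      ON t1.keyword_pid = t2.keyword_pid
     AND t1.cnt < t2.cnt
    ORDER BY t1.keyword_pid, t1.keyword_keyword, t2.keyword_keyword
""").fetchall()

for row in pairs:
    print(row)
```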
The rows where t2.cnt IS NULL are the ones holding the maximum count for each keyword_pid (this is part of the idiom for finding the maximum):
SELECT t1.*
FROM (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t1
LEFT JOIN (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t2
ON t1.keyword_pid = t2.keyword_pid
AND t1.cnt < t2.cnt
WHERE t2.cnt IS NULL
+------------+-------------+-----------------+-----+
| keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+-------------+-----------------+-----+
| 3 | 1 | red | 3 |
| 1 | 1 | widgety | 3 |
| 6 | 2 | screwy | 3 |
| 7 | 3 | spike | 2 |
+------------+-------------+-----------------+-----+
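The LEFT JOIN ... WHERE t2.cnt IS NULL filter is an anti-join: a row survives only if no other row for the same product has a larger count. The sketch below writes the same condition with NOT EXISTS and checks that both forms agree. It runs on Python's sqlite3 with hypothetical sample rows, and the COUNTS string is just a helper for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE keywords (
        keyword_id INTEGER PRIMARY KEY,
        keyword_pid INTEGER,
        keyword_keyword TEXT
    )
""")
# Hypothetical tagging events: (product id, keyword).
tags = ([(1, "red")] * 3 + [(1, "widgety")] * 3 + [(2, "curve")]
        + [(2, "red")] * 2 + [(2, "screwy")] * 3 + [(3, "red")]
        + [(3, "spike")] * 2)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags)

# Per-product keyword counts, reused as a derived table below.
COUNTS = """
    SELECT keyword_pid, keyword_keyword, COUNT(*) AS cnt
    FROM keywords
    GROUP BY keyword_pid, keyword_keyword
"""

# Form 1: left join against larger counts, keep the unmatched rows.
anti_join = conn.execute(f"""
    SELECT t1.keyword_pid, t1.keyword_keyword, t1.cnt
    FROM ({COUNTS}) t1
    LEFT JOIN ({COUNTS}) t2
      ON t1.keyword_pid = t2.keyword_pid
     AND t1.cnt < t2.cnt
    WHERE t2.cnt IS NULL
    ORDER BY t1.keyword_pid, t1.keyword_keyword
""").fetchall()

# Form 2: state "no row with a larger count exists" directly.
not_exists = conn.execute(f"""
    SELECT t1.keyword_pid, t1.keyword_keyword, t1.cnt
    FROM ({COUNTS}) t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM ({COUNTS}) t2
        WHERE t2.keyword_pid = t1.keyword_pid
          AND t2.cnt > t1.cnt
    )
    ORDER BY t1.keyword_pid, t1.keyword_keyword
""").fetchall()

print(anti_join)
print(anti_join == not_exists)
```

Because the comparison is strict (t1.cnt < t2.cnt), keywords tied for first place all survive, which is why product 1 keeps both of its keywords.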
The rest is relatively easy. First, we join the products table so we can see which product each keyword is associated with:
SELECT p.*,
t1.*
FROM (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t1
LEFT JOIN (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t2
ON t1.keyword_pid = t2.keyword_pid
AND t1.cnt < t2.cnt
LEFT JOIN products p
ON p.product_id = t1.keyword_pid
WHERE t2.cnt IS NULL
+------------+--------------+------------+-------------+-----------------+-----+
| product_id | product_name | keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+--------------+------------+-------------+-----------------+-----+
| 1 | widget | 3 | 1 | red | 3 |
| 1 | widget | 1 | 1 | widgety | 3 |
| 2 | screw | 6 | 2 | screwy | 3 |
| 3 | nail | 7 | 3 | spike | 2 |
+------------+--------------+------------+-------------+-----------------+-----+
If you want to keep keyword ties, the above is the solution.
If you want to get rid of ties (arbitrarily), you can use another GROUP BY:
SELECT p.*,
t1.*
FROM (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t1
LEFT JOIN (SELECT k.*,
COUNT(k.keyword_keyword) cnt
FROM keywords k
GROUP BY k.keyword_pid,
k.keyword_keyword) t2
ON t1.keyword_pid = t2.keyword_pid
AND t1.cnt < t2.cnt
LEFT JOIN products p
ON p.product_id = t1.keyword_pid
WHERE t2.cnt IS NULL
GROUP BY p.product_id
+------------+--------------+------------+-------------+-----------------+-----+
| product_id | product_name | keyword_id | keyword_pid | keyword_keyword | cnt |
+------------+--------------+------------+-------------+-----------------+-----+
| 1 | widget | 3 | 1 | red | 3 |
| 2 | screw | 6 | 2 | screwy | 3 |
| 3 | nail | 7 | 3 | spike | 2 |
+------------+--------------+------------+-------------+-----------------+-----+
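The final GROUP BY p.product_id leaves it to MySQL to pick which tied keyword is shown, and it is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5). If a repeatable result is preferred, one option is to break ties inside the join condition. The sketch below keeps the alphabetically first keyword among ties. That rule is an assumption for illustration, as are the sample rows and the use of Python's sqlite3 in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE keywords (
        keyword_id INTEGER PRIMARY KEY,
        keyword_pid INTEGER REFERENCES products (product_id),
        keyword_keyword TEXT
    );
""")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "widget"), (2, "screw"), (3, "nail")])
# Hypothetical tagging events: (product id, keyword).
tags = ([(1, "red")] * 3 + [(1, "widgety")] * 3 + [(2, "curve")]
        + [(2, "red")] * 2 + [(2, "screwy")] * 3 + [(3, "red")]
        + [(3, "spike")] * 2)
conn.executemany(
    "INSERT INTO keywords (keyword_pid, keyword_keyword) VALUES (?, ?)", tags)

# Per-product keyword counts, reused as a derived table below.
COUNTS = """
    SELECT keyword_pid, keyword_keyword, COUNT(*) AS cnt
    FROM keywords
    GROUP BY keyword_pid, keyword_keyword
"""

# A t1 row is "beaten" by t2 if t2 has a larger count, or the same count
# and an alphabetically earlier keyword. Exactly one row per product is
# never beaten, so no trailing GROUP BY is needed.
one_per_product = conn.execute(f"""
    SELECT p.product_id, p.product_name, t1.keyword_keyword, t1.cnt
    FROM ({COUNTS}) t1
    LEFT JOIN ({COUNTS}) t2
      ON t1.keyword_pid = t2.keyword_pid
     AND (t1.cnt < t2.cnt
          OR (t1.cnt = t2.cnt AND t1.keyword_keyword > t2.keyword_keyword))
    JOIN products p
      ON p.product_id = t1.keyword_pid
    WHERE t2.keyword_pid IS NULL
    ORDER BY p.product_id
""").fetchall()

for row in one_per_product:
    print(row)
```

Any column that is unique within a product's counted rows would work as the tie-breaker; the keyword itself is used here because (keyword_pid, keyword_keyword) is unique after grouping.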