Select 10 records associated with each key

Time: 2015-11-21 11:47:57

Tags: mysql

I have two tables, Tags and Customers, with the following structure:

Tags Table
ID Name   
1  Tag1
2  Tag2

Customers Table
ID Tag_ID Name
1  1      C1
2  2      C2
3  1      C3

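I want a SQL statement that gets the first 10 customers for each tag, in alphabetical order. Can this be done in a single query?

P.S. The data in the tables is sample data, not real data.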

3 answers:

Answer 0: (score: 3)

Consider the following:

DROP TABLE IF EXISTS tags;

CREATE TABLE tags 
(tag_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY 
,name VARCHAR(12) NOT NULL
);

INSERT INTO tags VALUES
(1,'One'),
(2,'Two'),
(3,'Three'),
(4,'Four'),
(5,'Five'),
(6,'Six');

DROP TABLE IF EXISTS customers;

CREATE TABLE customers  
(customer_id INT NOT NULL PRIMARY KEY
,customer VARCHAR(12)
);

INSERT INTO customers VALUES
(1,'Dave'),
(2,'Ben'),
(3,'Charlie'),
(4,'Michael'),
(5,'Steve'),
(6,'Clive'),
(7,'Alice'),
(8,'Ken'),
(9,'Petra');

DROP TABLE IF EXISTS customer_tag;

CREATE TABLE customer_tag
(customer_id INT NOT NULL
,tag_id INT NOT NULL
,PRIMARY KEY(customer_id,tag_id)
);

INSERT INTO customer_tag VALUES
(1,1),
(1,2),
(1,4),
(2,3),
(2,2),
(3,1),
(4,4),
(4,2),
(5,2),
(5,5),
(5,6),
(6,6);

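The following query returns every customer associated with each tag, together with that customer's "rank" within the tag when sorted alphabetically...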

SELECT t.*, c1.*, COUNT(ct2.tag_id) `rank`  -- quoted: RANK is a reserved word in MySQL 8.0.2 and later
  FROM tags t
  JOIN customer_tag ct1 
    ON ct1.tag_id = t.tag_id
  JOIN customers c1 
    ON c1.customer_id = ct1.customer_id 
  JOIN customer_tag ct2 
    ON ct2.tag_id = ct1.tag_id 
  JOIN customers c2 
    ON c2.customer_id = ct2.customer_id 
   AND c2.customer <= c1.customer 
 GROUP 
    BY t.tag_id, c1.customer_id
 ORDER 
    BY t.tag_id, `rank`;
+--------+-------+-------------+----------+------+
| tag_id | name  | customer_id | customer | rank |
+--------+-------+-------------+----------+------+
|      1 | One   |           3 | Charlie  |    1 |
|      1 | One   |           1 | Dave     |    2 |
|      2 | Two   |           2 | Ben      |    1 |
|      2 | Two   |           1 | Dave     |    2 |
|      2 | Two   |           4 | Michael  |    3 |
|      2 | Two   |           5 | Steve    |    4 |
|      3 | Three |           2 | Ben      |    1 |
|      4 | Four  |           1 | Dave     |    1 |
|      4 | Four  |           4 | Michael  |    2 |
|      5 | Five  |           5 | Steve    |    1 |
|      6 | Six   |           6 | Clive    |    1 |
|      6 | Six   |           5 | Steve    |    2 |
+--------+-------+-------------+----------+------+

If we want only the top 2 for each tag, we can rewrite it as follows...

SELECT t.*  
     , c1.*
  FROM tags t
  JOIN customer_tag ct1 
    ON ct1.tag_id = t.tag_id
  JOIN customers c1 
    ON c1.customer_id = ct1.customer_id 
  JOIN customer_tag ct2 
    ON ct2.tag_id = ct1.tag_id 
  JOIN customers c2 
    ON c2.customer_id = ct2.customer_id 
   AND c2.customer <= c1.customer 
 GROUP 
    BY t.tag_id, c1.customer_id
HAVING COUNT(ct2.tag_id) <=2
 ORDER 
   BY t.tag_id, c1.customer;
+--------+-------+-------------+----------+
| tag_id | name  | customer_id | customer |
+--------+-------+-------------+----------+
|      1 | One   |           3 | Charlie  |
|      1 | One   |           1 | Dave     |
|      2 | Two   |           2 | Ben      |
|      2 | Two   |           1 | Dave     |
|      3 | Three |           2 | Ben      |
|      4 | Four  |           1 | Dave     |
|      4 | Four  |           4 | Michael  |
|      5 | Five  |           5 | Steve    |
|      6 | Six   |           6 | Clive    |
|      6 | Six   |           5 | Steve    |
+--------+-------+-------------+----------+
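
The same counting idea can be applied to the one-to-many layout in the question. The following is only a sketch: it assumes the asker's tables are named Tags(ID, Name) and Customers(ID, Tag_ID, Name), as in the question, and it uses a limit of 10 instead of 2. Customers with identical names under the same tag would share a rank, so a tag could return more than 10 rows in that case.

```sql
-- Sketch for the question's schema (assumed): Tags(ID, Name), Customers(ID, Tag_ID, Name).
-- For each customer c1, count the customers c2 with the same tag whose name
-- sorts at or before c1's name; that count is c1's alphabetical rank.
SELECT t.ID    AS tag_id
     , t.Name  AS tag_name
     , c1.ID   AS customer_id
     , c1.Name AS customer
  FROM Tags t
  JOIN Customers c1
    ON c1.Tag_ID = t.ID
  JOIN Customers c2
    ON c2.Tag_ID = c1.Tag_ID
   AND c2.Name <= c1.Name
 GROUP
    BY t.ID, t.Name, c1.ID, c1.Name
HAVING COUNT(*) <= 10
 ORDER
    BY t.ID, c1.Name;
```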

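This is fine, but where performance is an issue, a solution like the following will be faster, although you may need to run SET NAMES utf8; before building the tables (as I had to) for it to work properly: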

SELECT tag_id, name, customer_id,customer 
  FROM
     (
       SELECT t.*
            , c.*
            , CASE WHEN @prev=t.tag_id THEN @i:=@i+1 ELSE @i:=1 END `rank`
            , @prev := t.tag_id
         FROM tags t
         JOIN customer_tag ct
           ON ct.tag_id = t.tag_id
         JOIN customers c
           ON c.customer_id = ct.customer_id
         JOIN ( SELECT @i:=1, @prev:=0) vars
        ORDER
           BY t.tag_id
            , c.customer
     ) x
 WHERE `rank` <=2
 ORDER 
    BY tag_id,customer;
+--------+-------+-------------+----------+
| tag_id | name  | customer_id | customer |
+--------+-------+-------------+----------+
|      1 | One   |           3 | Charlie  |
|      1 | One   |           1 | Dave     |
|      2 | Two   |           2 | Ben      |
|      2 | Two   |           1 | Dave     |
|      3 | Three |           2 | Ben      |
|      4 | Four  |           1 | Dave     |
|      4 | Four  |           4 | Michael  |
|      5 | Five  |           5 | Steve    |
|      6 | Six   |           6 | Clive    |
|      6 | Six   |           5 | Steve    |
+--------+-------+-------------+----------+
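
On MySQL 8.0 and later, which postdates this answer, window functions offer another way to number rows within each tag. The sketch below reuses the tags, customer_tag and customers tables defined above; the alias rn is my own choice (RANK is a reserved word from MySQL 8.0.2 onward), and this is one possible formulation rather than the answer author's method.

```sql
-- Requires MySQL 8.0 or later (window functions).
-- ROW_NUMBER() restarts at 1 for every tag_id and counts up in
-- alphabetical order of the customer name.
SELECT tag_id, name, customer_id, customer
  FROM
     (
       SELECT t.tag_id
            , t.name
            , c.customer_id
            , c.customer
            , ROW_NUMBER() OVER (PARTITION BY t.tag_id
                                     ORDER BY c.customer) AS rn
         FROM tags t
         JOIN customer_tag ct
           ON ct.tag_id = t.tag_id
         JOIN customers c
           ON c.customer_id = ct.customer_id
     ) x
 WHERE rn <= 2
 ORDER
    BY tag_id, customer;
```

For the sample data above, this should return the same rows as the two previous queries. Changing the limit to 10 gives the result asked for in the question.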

Answer 1: (score: 1)

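To achieve this, we need two session variables: one for the row number, and one to store the previous tag ID so that it can be compared with the current tag ID, as in the following query:

SELECT tag_id, tag_name, customer_name
  FROM ( SELECT t.id AS tag_id, t.name AS tag_name, c.name AS customer_name,
                @row_number := CASE WHEN @tag_id = t.id THEN @row_number + 1 ELSE 1 END AS row_num,
                @tag_id := t.id AS current_tag
           FROM tags t
           JOIN customers c ON c.tag_id = t.id
          CROSS JOIN (SELECT @row_number := 0, @tag_id := 0) vars
          ORDER BY t.id, c.name ) ranked
 WHERE row_num <= 10;

The query uses a CASE expression. While the tag ID stays the same, the row_number variable is incremented; when the tag ID changes, the counter restarts at 1. The outer WHERE clause then keeps the first 10 rows for each tag.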

Reference

Answer 2: (score: 0)

Your question reminded me of this one (especially the top-voted answer), so I came up with this:

SELECT Tags.ID,
       Tags.Name,
       SUBSTRING_INDEX(GROUP_CONCAT(Customers.Name
                                    ORDER BY Customers.Name),
                       ',', 10) AS Customers
FROM Customers
INNER JOIN Tags
ON Tags.ID = Customers.Tag_ID
GROUP BY Tags.ID
ORDER BY Tags.Id;

It works, but it is clearly a hacky approach, because MySQL does not provide a more natural way to do this.
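
Two caveats apply to the GROUP_CONCAT approach. The result is truncated at group_concat_max_len, which defaults to 1024 bytes, and a customer name that contains the separator would be split incorrectly by SUBSTRING_INDEX. The sketch below shows one way to guard against both; the limit of 1000000 and the '|' separator are illustrative assumptions, not values from the answer.

```sql
-- Assumption: 1000000 is an arbitrary illustrative limit; the default is 1024 bytes.
SET SESSION group_concat_max_len = 1000000;

-- Assumption: '|' never appears in a customer name.
SELECT Tags.ID,
       Tags.Name,
       SUBSTRING_INDEX(GROUP_CONCAT(Customers.Name
                                    ORDER BY Customers.Name
                                    SEPARATOR '|'),
                       '|', 10) AS Customers
FROM Customers
INNER JOIN Tags
ON Tags.ID = Customers.Tag_ID
GROUP BY Tags.ID, Tags.Name
ORDER BY Tags.ID;
```

Each tag still comes back as a single row holding a delimited string, so the client has to split it to recover the individual names.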