How to get X unique records in MySQL

Time: 2019-07-03 04:18:33

Tags: mysql sql

I want to fetch 10 records from a table, with at most 2 records from the same user.

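The MySQL table contains messages from users. I want to fetch messages from distinct users. If I wanted only one message per user this would be easy, since I could use distinct. But I want 2 messages per user.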

The table below shows the raw data.

--------------------------------------------------------------
| id | user_id | message                                     | 
--------------------------------------------------------------
| 1  | 111     | this is message A from user 1               |
--------------------------------------------------------------
| 2  | 111     | this is message B from user 1               |
--------------------------------------------------------------
| 3  | 111     | this is message C from user 1               |
--------------------------------------------------------------
| 4  | 222     | this is message A from user 2               |
--------------------------------------------------------------
| 5  | 222     | this is message B from user 2               |
--------------------------------------------------------------
| 6  | 222     | this is message C from user 2               |
--------------------------------------------------------------
| 7  | 333     | this is message A from user 3               |
--------------------------------------------------------------
| 8  | 333     | this is message B from user 3               |
--------------------------------------------------------------
| 9  | 333     | this is message C from user 3               |
--------------------------------------------------------------
... so on ...

Now I need a query that returns 2 rows per user, as shown below, up to a maximum of 10 records:

--------------------------------------------------------------
| id | user_id | message                                     | 
--------------------------------------------------------------
| 1  | 111     | this is message A from user 1               |
--------------------------------------------------------------
| 2  | 111     | this is message B from user 1               |
--------------------------------------------------------------
| 4  | 222     | this is message A from user 2               |
--------------------------------------------------------------
| 5  | 222     | this is message B from user 2               |
--------------------------------------------------------------
| 7  | 333     | this is message A from user 3               |
--------------------------------------------------------------
| 8  | 333     | this is message B from user 3               |
--------------------------------------------------------------
... so on ...

Edit:

A query like this, which groups the records by user_id, returns only a single record per user:

select max(id) as id, user_id, max(message) as message from user_messages group by user_id
--------------------------------------------------------------
| id | user_id | message                                     | 
--------------------------------------------------------------
| 3  | 111     | this is message C from user 1               |
--------------------------------------------------------------
| 6  | 222     | this is message C from user 2               |
--------------------------------------------------------------
| 9  | 333     | this is message C from user 3               |
--------------------------------------------------------------
... so on ...

But I can't find a way to get 2 records per user.

Edit 2:

As a workaround in a programming language, we could do the following:

- we need 10 records total
- we need 2 records max per user
- we can run a loop => 10 / 2 = 5 times
- each time we get a distinct user record
- each next time we append `id not in` to the query to avoid already loaded records

Something like this:

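$data = [];
$ids = [0]; // seed with a dummy id so the first "not in" list is never empty
// $mysqli is assumed to be an open mysqli connection; the mysql_* functions were removed in PHP 7
for ($i = 0; $i < 5; $i++) {
  // each pass picks the highest not-yet-loaded message id for every user
  $res = $mysqli->query("select max(id) as id, user_id from user_messages where id not in (".implode(',', $ids).") group by user_id");
  while ($row = $res->fetch_assoc()) {
    $ids[] = $row['id']; // exclude this id in later passes
    $data[] = $row;
  }
}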

But this is not the best solution, because it relies on application code rather than pure SQL.

1 Answer:

Answer 0 (score: 0)

In MySQL 8+, you would use row_number():

select um.*
from (select um.*,
             row_number() over (partition by user_id order by id) as seqnum
      from user_messages um
     ) um
where seqnum <= 2;
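
The question also asks for at most 10 rows in total, which the query above does not enforce by itself. The following is a minimal sketch of how that cap might be added and checked against the sample data. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (an assumption made only to keep the example self-contained; SQLite 3.25 or newer accepts the same row_number() syntax), and the outer `order by` and `limit 10` are additions that are not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table user_messages (id integer primary key, user_id integer, message text)"
)

# Sample data from the question: messages A, B, C for users 111, 222 and 333.
rows = []
next_id = 1
for n, user_id in enumerate((111, 222, 333), start=1):
    for letter in "ABC":
        rows.append((next_id, user_id, f"this is message {letter} from user {n}"))
        next_id += 1
conn.executemany("insert into user_messages values (?, ?, ?)", rows)

query = """
select id, user_id, message
from (select um.*,
             row_number() over (partition by user_id order by id) as seqnum
      from user_messages um
     ) um
where seqnum <= 2
order by user_id, id
limit 10  -- overall cap requested in the question (added in this sketch)
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data this keeps ids 1, 2, 4, 5, 7 and 8, which matches the desired output shown in the question. The explicit `order by` makes it deterministic which rows survive the `limit`.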

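In earlier versions, you can use a correlated scalar subquery (MySQL does not allow LIMIT inside an IN/ANY/ALL subquery):

select um.*
from user_messages um
where um.id <= coalesce((select um2.id
                         from user_messages um2
                         where um2.user_id = um.user_id
                         order by um2.id
                         limit 1 offset 1
                        ), um.id);

Incidentally, the coalesce() handles the case where a user has fewer than two messages.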
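
Another formulation that may be worth considering on older versions avoids LIMIT in the subquery altogether: keep a row only if the same user has fewer than two rows with a smaller id. The sketch below is not from the original answer. It again uses Python's sqlite3 module as a stand-in for MySQL, and the user 444 row is a hypothetical addition to show that users with a single message are kept.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table user_messages (id integer primary key, user_id integer, message text)"
)
conn.executemany(
    "insert into user_messages values (?, ?, ?)",
    [
        (1, 111, "this is message A from user 1"),
        (2, 111, "this is message B from user 1"),
        (3, 111, "this is message C from user 1"),
        (4, 222, "this is message A from user 2"),
        (5, 222, "this is message B from user 2"),
        (6, 222, "this is message C from user 2"),
        (10, 444, "only message from a hypothetical user 4"),
    ],
)

# A row qualifies when fewer than two rows of the same user precede it by id.
query = """
select um.*
from user_messages um
where (select count(*)
       from user_messages um2
       where um2.user_id = um.user_id
         and um2.id < um.id
      ) < 2
order by um.user_id, um.id
limit 10
"""
kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```

The counting subquery runs once per row, so on a large table an index on (user_id, id) would likely matter; that index is a suggestion here, not something the thread specifies.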