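Select multiple rows, with duplicates limited to unique meta-rows (MySQL)

Asked: 2014-10-14 21:59:53

Tags: mysql

I need to run a very specific query against a MySQL database. I'll start with an example table: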

+----+---------------+------------------------------------+----------+
| id | data          | pattern_key                        | hash     |
+----+---------------+------------------------------------+----------+
|  1 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  2 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  3 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  4 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | HGFEDCBA |
|  5 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | HGFEDCBA |
|  6 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | OPQRSTUW |
|  7 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  8 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  9 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | IJKLMNOP |
| 10 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
+----+---------------+------------------------------------+----------+

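As you can see, I have the columns data and pattern_key, which are irrelevant for now. What matters is the hash column, which allows repeated values, e.g. ABCDEFGH. What I want to do is select 5 rows ordered by id descending, but also include rows with a duplicate hash, and only when those rows directly follow one another. The result of querying this table should be: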

+----+---------------+------------------------------------+----------+
| id | data          | pattern_key                        | hash     |
+----+---------------+------------------------------------+----------+
| 10 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  9 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | IJKLMNOP |
|  8 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  7 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  6 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | OPQRSTUW |
|  5 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | HGFEDCBA |
+----+---------------+------------------------------------+----------+

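We get 6 records instead of 5, because row 7 is included: row 8 has the same hash. This behavior must ignore the number of duplicates as long as they directly follow one another, so if we ordered by id ascending instead, we would get: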

+----+---------------+------------------------------------+----------+
| id | data          | pattern_key                        | hash     |
+----+---------------+------------------------------------+----------+
|  1 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  2 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  3 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  4 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | HGFEDCBA |
|  5 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | HGFEDCBA |
|  6 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | OPQRSTUW |
|  7 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  8 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | ABCDEFGH |
|  9 | {"user":true} | NOTIFICATIONHUB::SYSTEM            | IJKLMNOP |
+----+---------------+------------------------------------+----------+

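That is because we have 3 ABCDEFGHs (which I count as the 1st unique meta-row), 2 HGFEDCBAs (the 2nd unique meta-row), 1 OPQRSTUW (the 3rd unique meta-row), 2 ABCDEFGHs (the 4th unique meta-row, since other hashes separate them from the first 3 ABCDEFGHs) and one IJKLMNOP (the 5th).

I was thinking about GROUP BY, but that would collapse the duplicates, and I want them included in the result set. Any ideas?

Thanks to @Uueerdo I came up with a solution (and yes, a JOIN is needed anyway):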
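
To make the "meta-row" rule concrete, here is a minimal Python sketch of the grouping described above. It is only an illustration of the intended semantics, not a MySQL solution. The helper name `first_metarows` and the `HASHES` list are assumptions made for this example; the hash values are copied from the sample table.

```python
from itertools import groupby

# Hashes from the sample table, in ascending id order (ids 1..10).
HASHES = ["ABCDEFGH", "ABCDEFGH", "ABCDEFGH", "HGFEDCBA", "HGFEDCBA",
          "OPQRSTUW", "ABCDEFGH", "ABCDEFGH", "IJKLMNOP", "ABCDEFGH"]


def first_metarows(rows, limit):
    """Return the rows that belong to the first `limit` runs of equal hashes.

    `rows` is a list of (id, hash) pairs, already in the desired order.
    (Hypothetical helper, written only for this illustration.)
    """
    selected = []
    # groupby merges only *adjacent* equal keys, which matches a "meta-row".
    runs = groupby(rows, key=lambda row: row[1])
    for run_number, (_, run) in enumerate(runs, start=1):
        if run_number > limit:
            break
        selected.extend(run)
    return selected


rows_ascending = list(enumerate(HASHES, start=1))
print([row_id for row_id, _ in first_metarows(rows_ascending, 5)])
```

On the ascending order this yields ids 1 through 9, matching the table above. Applied to the reversed list it yields ids 10 down to 4: id 4 shares its hash with id 5, so a strictly run-based rule keeps it as well.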

SET @i := 0;
SET @lastHash := '';
SELECT *
FROM
  (SELECT notification_real_id AS id, data, pattern_key, @i := IF(hash <> @lastHash, @i + 1, @i) AS hashGroup, @lastHash := hash AS hash
   FROM
     ( SELECT notifications.id AS notification_real_id,
              data,
              pattern_key,
              hash
      FROM notifications
      INNER JOIN notifications_users ON notifications.id = notifications_users.notification_id
      WHERE notifications_users.user_id = 1) AS subJoin
   ORDER BY notification_real_id DESC) AS subQ
WHERE hashGroup <= 5;

1 answer:

Answer 0 (score: 6)

I don't have time to verify that this is exactly right, but something like this should work:

SET @i := 0;
SET @lastHash := '';
SELECT *
FROM (
  -- hashGroup increases by one whenever hash differs from the previous row's hash
  SELECT id, data, pattern_key
   , @i := IF(hash <> @lastHash, @i + 1, @i) AS hashGroup
   , @lastHash := hash AS hash
  FROM the_table
  ORDER BY id DESC
) AS subQ
WHERE hashGroup <= 5
;
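
The variable-based query relies on MySQL evaluating the user-variable assignments row by row in `ORDER BY` order, which MySQL does not guarantee. As a hedged alternative, the same run numbering can be sketched with window functions (`LAG` plus a running `SUM`), which MySQL supports from 8.0 onward. The sketch below runs the query through Python's `sqlite3` module (assuming the bundled SQLite is 3.25 or newer) so it can be executed without a MySQL server. The reduced two-column `notifications` table and the names `build_sample_db`, `latest_runs`, `is_new_run` and `hash_group` are assumptions made for this example.

```python
import sqlite3

# Hashes from the sample table, in ascending id order (ids 1..10).
HASHES = ["ABCDEFGH", "ABCDEFGH", "ABCDEFGH", "HGFEDCBA", "HGFEDCBA",
          "OPQRSTUW", "ABCDEFGH", "ABCDEFGH", "IJKLMNOP", "ABCDEFGH"]

# Standard SQL window functions; no user variables involved.
RUN_QUERY = """
SELECT id
FROM (
    SELECT id, hash,
           -- running count of run starts = number of the current run
           SUM(is_new_run) OVER (ORDER BY id DESC) AS hash_group
    FROM (
        SELECT id, hash,
               -- 1 when the hash differs from the previous row (or there is none)
               CASE WHEN hash = LAG(hash) OVER (ORDER BY id DESC)
                    THEN 0 ELSE 1 END AS is_new_run
        FROM notifications
    ) AS marked
) AS numbered
WHERE hash_group <= ?
ORDER BY id DESC
"""


def build_sample_db():
    """Create an in-memory table holding only the id and hash columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notifications (id INTEGER PRIMARY KEY, hash TEXT)")
    conn.executemany("INSERT INTO notifications (id, hash) VALUES (?, ?)",
                     list(enumerate(HASHES, start=1)))
    return conn


def latest_runs(conn, limit):
    """Return the ids of the rows in the newest `limit` runs of equal hashes."""
    return [row[0] for row in conn.execute(RUN_QUERY, (limit,))]


print(latest_runs(build_sample_db(), 5))
```

Because the first row has no predecessor, `LAG` returns NULL there and the `CASE` falls through to 1, so numbering starts at 1 just as in the variable-based version.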

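Alternatively (no variables needed; not recommended: it is unreliable when there is too little data, less flexible once joins are involved, and very likely much slower):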

SELECT * 
FROM the_table  
WHERE id > (
   SELECT MAX(id) AS lastID    
   FROM the_table
   GROUP BY hash 
   ORDER BY lastID DESC LIMIT 5, 1
  )
;
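
The reliability caveat can be reproduced on the question's sample data. That data has only four distinct hashes, so `LIMIT 5, 1` finds no sixth group, the subquery yields NULL, and `id > NULL` matches nothing. The sketch below runs the same query shape through Python's `sqlite3` module rather than MySQL, with the offset turned into a parameter. The helper name `rows_after_cutoff` and the reduced two-column `the_table` are assumptions made for this example.

```python
import sqlite3

# Hashes from the sample table, in ascending id order (ids 1..10).
HASHES = ["ABCDEFGH", "ABCDEFGH", "ABCDEFGH", "HGFEDCBA", "HGFEDCBA",
          "OPQRSTUW", "ABCDEFGH", "ABCDEFGH", "IJKLMNOP", "ABCDEFGH"]

# Same shape as the query above; only the offset is parameterised.
CUTOFF_QUERY = """
SELECT id
FROM the_table
WHERE id > (
    SELECT MAX(id) AS lastID
    FROM the_table
    GROUP BY hash
    ORDER BY lastID DESC LIMIT ?, 1
)
ORDER BY id
"""


def rows_after_cutoff(skip):
    """Run the cutoff query, skipping `skip` distinct hashes before the cutoff."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, hash TEXT)")
    conn.executemany("INSERT INTO the_table (id, hash) VALUES (?, ?)",
                     list(enumerate(HASHES, start=1)))
    return [row[0] for row in conn.execute(CUTOFF_QUERY, (skip,))]


print(rows_after_cutoff(5))  # fewer than six distinct hashes: nothing matches
print(rows_after_cutoff(3))  # cutoff is the newest id of the fourth distinct hash
```

With an offset of 3 the query returns ids 6 through 10. It counts those as three distinct hashes, although they form four separate runs, because `GROUP BY hash` merges every ABCDEFGH row into one group regardless of adjacency.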