Using SQL - select the Xth largest value from unique column combinations

Time: 2017-09-28 15:56:19

Tags: mysql sql database filtering

I have been stuck on this problem. I have a data table with the following structure:

 my_tbl:
 -------------------
     id: primary_key
   time: datetime
  asset: int  (number of in-game asset)
   data: char (data generated through in-game asset)
version: int  (version of the asset)

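Now, from this table, I want to query the data with the Xth largest version for each unique time.

I have already developed a query that gets the data for each asset and each unique time where the version is the largest.

Here is my query: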

SELECT `asset`, `time`, `data`, `version` FROM `my_tbl`
INNER JOIN (
  SELECT MAX(version) as max_iter, `time` as t FROM `my_tbl`
  GROUP BY time
) AS B
ON (B.t = my_tbl.time AND B.max_iter = my_tbl.version) 
ORDER BY asset ASC;

Now I cannot figure out how to get the second largest, and so on...

Here is my data set:

CREATE TABLE IF NOT EXISTS `my_tbl` (
  `id` int unsigned NOT NULL,
  `time` DATETIME NOT NULL,
  `asset` int NOT NULL,
  `data` DECIMAL(7,2) NOT NULL,
  `version` int NOT NULL,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `my_tbl` (`id`, `time`, `asset`, `data`, `version`) VALUES
  ( 1, '2017-11-01 10:00:00',1,   7.32, 1),
  ( 2, '2017-11-01 11:00:00',1,  10.32, 1),
  ( 3, '2017-11-01 12:00:00',1,   7.4 , 1),
  ( 4, '2017-11-01 11:00:00',1,   4.3 , 2),
  ( 5, '2017-11-01 12:00:00',1,   4.4 , 2),
  ( 6, '2017-11-01 13:00:00',1,   4.6 , 2),
  ( 7, '2017-11-01 12:00:00',1,   8.3 , 3),
  ( 8, '2017-11-01 13:00:00',1,   8.4 , 3),
  ( 9, '2017-11-01 14:00:00',1,   8.6 , 3),
  (10, '2017-11-01 13:00:00',1,   9.3 , 4),
  (11, '2017-11-01 14:00:00',1,   9.4 , 4),
  (12, '2017-11-01 15:00:00',1,   9.6 , 4),
  (13, '2017-11-01 10:00:00',2,  70   , 1),
  (14, '2017-11-01 11:00:00',2, 100   , 1),
  (15, '2017-11-01 12:00:00',2,  74   , 1),
  (16, '2017-11-01 11:00:00',2,  43   , 2),
  (17, '2017-11-01 12:00:00',2,  44   , 2),
  (18, '2017-11-01 13:00:00',2,  46   , 2),
  (19, '2017-11-01 12:00:00',2,  83   , 3),
  (20, '2017-11-01 13:00:00',2,  84   , 3),
  (21, '2017-11-01 14:00:00',2,  86   , 3),
  (22, '2017-11-01 13:00:00',2,  93   , 4),
  (23, '2017-11-01 14:00:00',2,  94   , 4),
  (24, '2017-11-01 15:00:00',2,  96   , 4),
  (25, '2017-11-01 15:00:00',3,  96   , 4); 

Here is a link to a fiddle that finds the maximum:

https://www.db-fiddle.com/f/ggyHLAzbLpWNWwVNaZPJuM/2

For the (2)nd largest, the result should look like this:

+----+---------------------+-------+--------+---------+
| id | time                | asset | data   | version |
+----+---------------------+-------+--------+---------+
|  1 | 2017-11-01 10:00:00 |     1 |   7.32 |       1 |
|  4 | 2017-11-01 11:00:00 |     1 |   4.30 |       2 |
|  5 | 2017-11-01 12:00:00 |     1 |   4.40 |       2 |
|  8 | 2017-11-01 13:00:00 |     1 |   8.40 |       3 |
| 11 | 2017-11-01 14:00:00 |     1 |   9.40 |       4 |
| 12 | 2017-11-01 15:00:00 |     1 |   9.60 |       4 |
| 13 | 2017-11-01 10:00:00 |     2 |  70.00 |       1 |
| 16 | 2017-11-01 11:00:00 |     2 |  43.00 |       2 |
| 17 | 2017-11-01 12:00:00 |     2 |  44.00 |       2 |
| 20 | 2017-11-01 13:00:00 |     2 |  84.00 |       3 |
| 23 | 2017-11-01 14:00:00 |     2 |  94.00 |       4 |
| 24 | 2017-11-01 15:00:00 |     2 |  96.00 |       4 |
| 25 | 2017-11-01 15:00:00 |     3 |  96.00 |       4 |
+----+---------------------+-------+--------+---------+

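2 answers:

Answer 0 (score: 0)

Before attempting an answer, I think your description and your data set do not match. If the description is correct, then I would expect the following result: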

+----+---------------------+-------+--------+---------+
| id | time                | asset | data   | version |
+----+---------------------+-------+--------+---------+
|  1 | 2017-11-01 10:00:00 |     1 |   7.32 |       1 |
|  4 | 2017-11-01 11:00:00 |     1 |   4.30 |       2 |
|  5 | 2017-11-01 12:00:00 |     1 |   4.40 |       2 |
|  8 | 2017-11-01 13:00:00 |     1 |   8.40 |       3 |
| 11 | 2017-11-01 14:00:00 |     1 |   9.40 |       4 |
| 12 | 2017-11-01 15:00:00 |     1 |   9.60 |       4 |
| 13 | 2017-11-01 10:00:00 |     2 |  70.00 |       1 |
| 16 | 2017-11-01 11:00:00 |     2 |  43.00 |       2 |
| 17 | 2017-11-01 12:00:00 |     2 |  44.00 |       2 |
| 20 | 2017-11-01 13:00:00 |     2 |  84.00 |       3 |
| 23 | 2017-11-01 14:00:00 |     2 |  94.00 |       4 |
| 24 | 2017-11-01 15:00:00 |     2 |  96.00 |       4 |
| 25 | 2017-11-01 15:00:00 |     3 |  96.00 |       4 |
+----+---------------------+-------+--------+---------+

Assuming this is correct, I think this will return what you are after...

SELECT m.*
  FROM my_tbl m
  JOIN
     ( SELECT a.time
            , a.asset
            , MAX(a.version) version
         FROM
            ( SELECT time
                   , asset
                   , version
                   , CASE WHEN @time = time
                          THEN CASE WHEN @asset = asset
                                    THEN @i:=@i+1
                                    ELSE @i:=1 END
                          ELSE @i:=1 END i
                   , @time :=time
                   , @asset := asset
                FROM my_tbl
                   , (SELECT @time:=null, @asset:=null,@i:=0) vars
               ORDER
                  BY time
                   , asset
                   , version
             ) a
         WHERE i <= 2
         GROUP
            BY time, asset
     ) n
    ON n.time = m.time
   AND n.asset = m.asset
   AND n.version = m.version
 ORDER 
    BY m.id;

This assumes a natural key on (time, asset, version).
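On MySQL 8.0 or later, the same selection could probably be written with window functions instead of user variables. The following is only a sketch, under the same assumption that (time, asset, version) is unique: it numbers the versions of each (time, asset) pair in ascending order and keeps the Xth one, falling back to the highest version when a pair has fewer than X rows. The aliases rn, cnt and ranked are illustrative names, and X is hard-coded as 2.

```sql
-- Sketch for MySQL 8.0+ (window functions required); X = 2 here.
-- rn  : position of the row within its (time, asset) group, lowest version first
-- cnt : number of rows in that group
SELECT id, time, asset, data, version
FROM (
    SELECT m.*,
           ROW_NUMBER() OVER (PARTITION BY time, asset ORDER BY version) AS rn,
           COUNT(*)     OVER (PARTITION BY time, asset)                  AS cnt
    FROM my_tbl AS m
) AS ranked
-- Keep the Xth row, or the last row when the group has fewer than X rows.
WHERE rn = LEAST(2, cnt)
ORDER BY id;
```

Against the sample data this should select the same 13 rows as the expected result shown earlier. If (time, asset, version) were not unique, ROW_NUMBER() would pick one of the tied rows arbitrarily, whereas the join-based query would return all of them.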

Edit:

For i <= 3, we would expect the following result (the desired rows are highlighted with '<--')...

SELECT * FROM my_tbl ORDER BY time, asset, version;
+----+---------------------+-------+--------+---------+
| id | time                | asset | data   | version |
+----+---------------------+-------+--------+---------+
|  1 | 2017-11-01 10:00:00 |     1 |   7.32 |       1 |<--

| 13 | 2017-11-01 10:00:00 |     2 |  70.00 |       1 |<--

|  2 | 2017-11-01 11:00:00 |     1 |  10.32 |       1 |
|  4 | 2017-11-01 11:00:00 |     1 |   4.30 |       2 |<--

| 14 | 2017-11-01 11:00:00 |     2 | 100.00 |       1 |
| 16 | 2017-11-01 11:00:00 |     2 |  43.00 |       2 |<--

|  3 | 2017-11-01 12:00:00 |     1 |   7.40 |       1 |
|  5 | 2017-11-01 12:00:00 |     1 |   4.40 |       2 |
|  7 | 2017-11-01 12:00:00 |     1 |   8.30 |       3 |<--

| 15 | 2017-11-01 12:00:00 |     2 |  74.00 |       1 |
| 17 | 2017-11-01 12:00:00 |     2 |  44.00 |       2 |
| 19 | 2017-11-01 12:00:00 |     2 |  83.00 |       3 |<--

|  6 | 2017-11-01 13:00:00 |     1 |   4.60 |       2 |
|  8 | 2017-11-01 13:00:00 |     1 |   8.40 |       3 |
| 10 | 2017-11-01 13:00:00 |     1 |   9.30 |       4 |<--

| 18 | 2017-11-01 13:00:00 |     2 |  46.00 |       2 |
| 20 | 2017-11-01 13:00:00 |     2 |  84.00 |       3 |
| 22 | 2017-11-01 13:00:00 |     2 |  93.00 |       4 |<--

|  9 | 2017-11-01 14:00:00 |     1 |   8.60 |       3 |
| 11 | 2017-11-01 14:00:00 |     1 |   9.40 |       4 |<--

| 21 | 2017-11-01 14:00:00 |     2 |  86.00 |       3 |
| 23 | 2017-11-01 14:00:00 |     2 |  94.00 |       4 |<--

| 12 | 2017-11-01 15:00:00 |     1 |   9.60 |       4 |<--

| 24 | 2017-11-01 15:00:00 |     2 |  96.00 |       4 |<--

| 25 | 2017-11-01 15:00:00 |     3 |  96.00 |       4 |<--
+----+---------------------+-------+--------+---------+

Indeed, substituting 'i <= 3' for 'i <= 2' returns this result set.

Answer 1 (score: 0)

If the DBMS you are using supports ROW_NUMBER() OVER(), then I would use that.

SELECT *
FROM (
      SELECT
              asset
            , time
            , data
            , version 
            , row_number() over(partition by Time order by asset DESC) as RowNumber
      FROM my_tbl
     ) d
WHERE RowNumber = 3
;

If the version of MySQL you are using does not support it (window functions were added in MySQL 8.0), then I would do this:

SELECT *
FROM (
      SELECT
              @row_num :=IF(@prev_value=`time`,@row_num+1,1)AS RowNumber
            , `asset`
            , `time`
            , `data`
            , `version` 
            , @prev_value := `time`
      FROM `my_tbl`
      CROSS JOIN (
                  SELECT @row_num :=1,  @prev_value :=''
                  ) vars
      ORDER BY
              `time`
            , `asset` DESC
     ) d
WHERE RowNumber = 3
;

Now you should be able to find the Nth largest asset at any point in time.

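Note: because you are dealing with dates/times, which may be precise to the millisecond, you may need to "truncate" the times to a larger unit of time (such as the hour, as we see in the sample data).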
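One way to do that truncation in MySQL is DATE_FORMAT() with a fixed minutes-and-seconds part. The sketch below is only an illustration: the alias time_hour and the choice of MAX(version) as the aggregate are assumptions made for the example, not part of the queries above.

```sql
-- Sketch: bucket timestamps by hour before grouping (or ranking).
-- DATE_FORMAT returns a string such as '2017-11-01 13:00:00'.
SELECT DATE_FORMAT(`time`, '%Y-%m-%d %H:00:00') AS time_hour,
       asset,
       MAX(version) AS max_version
FROM my_tbl
GROUP BY DATE_FORMAT(`time`, '%Y-%m-%d %H:00:00'), asset
ORDER BY time_hour, asset;
```

The same expression could stand in for `time` wherever the queries above compare or partition by it.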

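NB: if your DBMS supports DENSE_RANK() OVER(), then using it may be a better solution, because it will return rows that are "equal second" or "equal third", etc., whereas ROW_NUMBER() will not.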
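As a rough illustration of the DENSE_RANK() idea, the sketch below ranks the distinct versions at each time from highest to lowest and keeps every row that carries the 2nd-highest one, so several assets sharing that version are all returned. It assumes MySQL 8.0 or later; the alias version_rank is a made-up name, and ranking by version rather than by asset is a choice made here to match the wording of the question.

```sql
-- Sketch for MySQL 8.0+: all rows holding the 2nd-highest distinct version at each time.
SELECT id, time, asset, data, version
FROM (
    SELECT m.*,
           -- Rows with equal versions at the same time share one rank, with no gaps.
           DENSE_RANK() OVER (PARTITION BY time ORDER BY version DESC) AS version_rank
    FROM my_tbl AS m
) AS d
WHERE version_rank = 2
ORDER BY time, asset;
```

Note that a time with fewer than two distinct versions returns no rows here, which differs from the expected result in the question, where such times fall back to the highest available version.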