Select the latest 3 records for each ID in a table

Asked: 2015-06-05 22:29:40

Tags: sql sqlite group-by sql-order-by greatest-n-per-group

My table has a composite primary key (ID, Date), as shown below.

+------+------------+-------+
|  ID  |    Date    | Value |
+------+------------+-------+
|   1  | 1433419200 |   15  |
|   1  | 1433332800 |   23  |
|   1  | 1433246400 |   41  |
|   1  | 1433160000 |   55  |
|   1  | 1432900800 |   24  |
|   2  | 1433419200 |   52  |
|   2  | 1433332800 |   23  |
|   2  | 1433246400 |   39  |
|   2  | 1433160000 |   22  |
|   3  | 1433419200 |   11  |
|   3  | 1433246400 |   58  |
|  ... |    ...     |  ...  |
+------+------------+-------+

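There is also a separate index on the Date column. The table is moderately sized: currently about 600k rows, growing by about 2k rows per day.

I want a SELECT query that returns the latest 3 records for each ID (ordered by the Date timestamp). For any given ID the Date values are always unique, so there is no need to worry about ties on Date here.

I tried a self-join approach inspired by this answer, but it took several seconds to run and returned nothing: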

SELECT p1.ID, p1.Date, p1.Value FROM MyTable AS p1
LEFT JOIN MyTable AS p2 
ON p1.ID=p2.ID AND p1.Date<=p2.Date
GROUP BY p1.ID
HAVING COUNT(*)<=5
ORDER BY p1.ID, p1.Date DESC;

What would be a fast solution here?

3 answers:

Answer 0 (score: 11)

You can look up the three most recent dates for each ID:

SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
               FROM MyTable AS T2
               WHERE T2.ID = MyTable.ID
               ORDER BY Date DESC
               LIMIT 3)
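To try this query against the sample rows from the question, here is a minimal sketch using Python's standard sqlite3 module and an in-memory database. The helper names make_db and latest_three_in are made up for this illustration, and the trailing ORDER BY is an addition so the output order is deterministic.

```python
import sqlite3

# Sample rows copied from the table in the question: (ID, Date, Value).
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]


def make_db():
    """Hypothetical helper: in-memory table with the composite primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
        "PRIMARY KEY (ID, Date))"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)
    return conn


def latest_three_in(conn):
    """Run the IN (... ORDER BY Date DESC LIMIT 3) query from the answer."""
    sql = """
        SELECT ID, Date, Value
        FROM MyTable
        WHERE Date IN (SELECT Date
                       FROM MyTable AS T2
                       WHERE T2.ID = MyTable.ID
                       ORDER BY Date DESC
                       LIMIT 3)
        ORDER BY ID, Date DESC  -- added only to make the output order stable
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in latest_three_in(make_db()):
        print(row)
```

With the sample data this yields three rows each for IDs 1 and 2, and both rows for ID 3, which has only two records.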

Alternatively, look up the third most recent date for each ID and use it as a cutoff:

SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
                      FROM MyTable AS T2
                      WHERE T2.ID = MyTable.ID
                      ORDER BY Date DESC
                      LIMIT 1 OFFSET 2),
                     0)
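The IFNULL matters for IDs that have fewer than three rows: the OFFSET 2 subquery then returns no row, the scalar value is NULL, and a comparison with NULL filters the row out. The sketch below, assuming Python's standard sqlite3 module and the question's sample rows, runs the query with and without the IFNULL wrapper. The helper names and the with_ifnull flag are hypothetical, and the ORDER BY is added only for stable output.

```python
import sqlite3

# Sample rows copied from the table in the question: (ID, Date, Value).
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]

# Correlated subquery from the answer: the third most recent Date per ID.
CUTOFF = """(SELECT Date
             FROM MyTable AS T2
             WHERE T2.ID = MyTable.ID
             ORDER BY Date DESC
             LIMIT 1 OFFSET 2)"""


def make_db():
    """Hypothetical helper: in-memory table with the composite primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
        "PRIMARY KEY (ID, Date))"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)
    return conn


def latest_three_offset(conn, with_ifnull=True):
    """Keep rows whose Date is at or after the per-ID cutoff."""
    # with_ifnull=False is only here to show what the wrapper protects against.
    cutoff = "IFNULL(" + CUTOFF + ", 0)" if with_ifnull else CUTOFF
    sql = (
        "SELECT ID, Date, Value FROM MyTable "
        "WHERE Date >= " + cutoff + " "
        "ORDER BY ID, Date DESC"  # added for a deterministic output order
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(latest_three_offset(conn))                     # includes ID 3
    print(latest_three_offset(conn, with_ifnull=False))  # ID 3 is dropped
```

On the sample data, ID 3 has only two rows, so it appears in the result only when the IFNULL fallback is in place.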

Both queries should perform well thanks to the primary key's index.

Answer 1 (score: 2)

First, here is the correct query for the inequality approach:

SELECT p1.ID, p1.Date, p1.Value
FROM MyTable p1 LEFT JOIN
     MyTable AS p2 
     ON p1.ID = p2.ID AND p2.Date <= p1.Date
--------------------------^ fixed this condition
GROUP BY p1.ID, p1.Date, p1.Value
HAVING COUNT(*) <= 5
ORDER BY p1.ID, p1.Date DESC;
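For experimenting with the counting idea, here is a sketch using Python's standard sqlite3 module and the question's sample rows. It is an adaptation rather than the answer's exact text: to rank rows from the newest, the join counts rows that are at least as recent as p1 (p2.Date >= p1.Date), and the limit is 3 to match the question. It also shows the per-ID pair counts produced when grouping by p1.ID alone, which on a large table easily exceed the HAVING limit and would explain an empty result. All helper names are hypothetical.

```python
import sqlite3

# Sample rows copied from the table in the question: (ID, Date, Value).
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]


def make_db():
    """Hypothetical helper: in-memory table with the composite primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
        "PRIMARY KEY (ID, Date))"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)
    return conn


def pair_counts_per_id(conn):
    """Counts seen by HAVING when grouping by p1.ID only (one group per ID)."""
    sql = """
        SELECT p1.ID, COUNT(*)
        FROM MyTable AS p1
        LEFT JOIN MyTable AS p2
          ON p1.ID = p2.ID AND p1.Date <= p2.Date
        GROUP BY p1.ID
        ORDER BY p1.ID
    """
    return conn.execute(sql).fetchall()


def latest_three_self_join(conn):
    """Group per row; COUNT(*) is then that row's rank from the newest."""
    sql = """
        SELECT p1.ID, p1.Date, p1.Value
        FROM MyTable AS p1
        JOIN MyTable AS p2
          ON p1.ID = p2.ID AND p2.Date >= p1.Date
        GROUP BY p1.ID, p1.Date, p1.Value
        HAVING COUNT(*) <= 3
        ORDER BY p1.ID, p1.Date DESC
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(pair_counts_per_id(conn))
    print(latest_three_self_join(conn))
```

Note that the self-join compares every row with every other row of the same ID, so the work grows quadratically with the number of rows per ID.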

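I'm not sure there is a fast way to do this in SQLite. In most other databases, you could use the ANSI standard row_number() function. In MySQL, you could use variables. Both of these are difficult in SQLite. Your best solution may be to use a cursor.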
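For readers on a newer SQLite: window functions, including ROW_NUMBER(), were added in SQLite 3.25.0, which was released after this answer was written. The following is a minimal sketch of that approach using Python's standard sqlite3 module and the question's sample rows. It assumes the SQLite library bundled with your Python is at least 3.25.0; the helper names are hypothetical.

```python
import sqlite3

# Sample rows copied from the table in the question: (ID, Date, Value).
ROWS = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]


def make_db():
    """Hypothetical helper: in-memory table with the composite primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
        "PRIMARY KEY (ID, Date))"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)
    return conn


def latest_n_window(conn, n=3):
    """Number each ID's rows from newest to oldest and keep the first n."""
    # Assumption: the linked SQLite library supports window functions.
    if sqlite3.sqlite_version_info < (3, 25, 0):
        raise RuntimeError("ROW_NUMBER() needs SQLite 3.25.0 or newer")
    sql = """
        SELECT ID, Date, Value
        FROM (SELECT ID, Date, Value,
                     ROW_NUMBER() OVER (PARTITION BY ID
                                        ORDER BY Date DESC) AS rn
              FROM MyTable)
        WHERE rn <= ?
        ORDER BY ID, Date DESC
    """
    return conn.execute(sql, (n,)).fetchall()


if __name__ == "__main__":
    print(sqlite3.sqlite_version)
    for row in latest_n_window(make_db()):
        print(row)
```

Whether this beats the correlated-subquery queries on a 600k-row table is something to measure on your own data; the sketch only shows the shape of the query.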

The query above can benefit from an index on MyTable(Id, Date).

Answer 2 (score: 0)

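-- T-SQL (SQL Server) syntax: CROSS APPLY and TOP are not supported by SQLite.
SELECT DISTINCT x.ID, x.Date, x.Value
FROM (SELECT DISTINCT ID FROM XXXTable) c
    CROSS APPLY (
        -- For the current c.ID, keep the 3 rows with the highest row number,
        -- i.e. the 3 most recent dates.
        SELECT TOP 3 A.ID, A.Date, A.Value, A.[Count]
        FROM (
            SELECT t.ID, t.Date, t.Value,
                   ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.Date) AS [Count]
            FROM XXXTable AS t
            WHERE t.ID = c.ID
        ) A
        ORDER BY A.[Count] DESC
    ) x;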