My table has a composite primary key (ID, Date), as shown below.
+------+------------+-------+
| ID   | Date       | Value |
+------+------------+-------+
| 1    | 1433419200 | 15    |
| 1    | 1433332800 | 23    |
| 1    | 1433246400 | 41    |
| 1    | 1433160000 | 55    |
| 1    | 1432900800 | 24    |
| 2    | 1433419200 | 52    |
| 2    | 1433332800 | 23    |
| 2    | 1433246400 | 39    |
| 2    | 1433160000 | 22    |
| 3    | 1433419200 | 11    |
| 3    | 1433246400 | 58    |
| ...  | ...        | ...   |
+------+------------+-------+
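There is also a separate index on the Date column. The table is moderately sized: about 600k rows at the moment, growing by roughly 2k rows per day.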
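I want a SELECT query that returns the 3 most recent records (ordered by the Date timestamp) for each ID. For any given ID the Date values are always unique, so there is no need to worry about ties on Date.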
I tried a self-join approach inspired by this answer, but it took several seconds to run and returned nothing:
SELECT p1.ID, p1.Date, p1.Value FROM MyTable AS p1
LEFT JOIN MyTable AS p2
ON p1.ID=p2.ID AND p1.Date<=p2.Date
GROUP BY p1.ID
HAVING COUNT(*)<=5
ORDER BY p1.ID, p1.Date DESC;
What would be a fast solution here?
Answer 0 (score: 11)
You can look up the three most recent dates for each ID:
SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 3)
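To try this query quickly, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an in-memory table loaded with only the sample rows shown in the question (the elided rows are left out), and it adds an outer ORDER BY purely to make the output deterministic.

```python
import sqlite3

# In-memory database with the question's schema: composite primary key (ID, Date).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)

# Only the rows visible in the question; anything behind "..." is not included.
rows = [
    (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
    (1, 1433160000, 55), (1, 1432900800, 24),
    (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
    (2, 1433160000, 22),
    (3, 1433419200, 11), (3, 1433246400, 58),
]
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)

# Correlated subquery: for each outer row, list the three newest dates of its ID.
latest_three = conn.execute("""
    SELECT ID, Date, Value
    FROM MyTable
    WHERE Date IN (SELECT Date
                   FROM MyTable AS T2
                   WHERE T2.ID = MyTable.ID
                   ORDER BY Date DESC
                   LIMIT 3)
    ORDER BY ID, Date DESC
""").fetchall()

for row in latest_three:
    print(row)
```

With this sample, IDs 1 and 2 each contribute three rows, and ID 3 contributes the two rows it has.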
Alternatively, look up the third most recent date for each ID and use it as a cutoff:
SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 1 OFFSET 2),
0)
Both queries should get good performance from the primary key's index.
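The cutoff variant can be checked the same way. The sketch below assumes a small in-memory table (four rows for ID 1, two rows for ID 3, taken from the question's sample) to show what the IFNULL fallback does when an ID has fewer than three rows. The `cutoffs` lookup is only an illustrative helper, not part of the answer's query, and the printed query plan will vary with the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 1433419200, 15), (1, 1433332800, 23),
        (1, 1433246400, 41), (1, 1433160000, 55),
        (3, 1433419200, 11), (3, 1433246400, 58),
    ],
)

# Illustrative helper: third most recent Date per ID.
# The scalar subquery yields NULL (None) when an ID has fewer than three rows.
cutoffs = dict(conn.execute("""
    SELECT T1.ID,
           (SELECT Date
            FROM MyTable AS T2
            WHERE T2.ID = T1.ID
            ORDER BY Date DESC
            LIMIT 1 OFFSET 2)
    FROM (SELECT DISTINCT ID FROM MyTable) AS T1
""").fetchall())

# The cutoff query itself: IFNULL(..., 0) turns a missing cutoff into
# "keep every row of this ID".
query = """
    SELECT ID, Date, Value
    FROM MyTable
    WHERE Date >= IFNULL((SELECT Date
                          FROM MyTable AS T2
                          WHERE T2.ID = MyTable.ID
                          ORDER BY Date DESC
                          LIMIT 1 OFFSET 2),
                         0)
"""
latest_three = conn.execute(query + " ORDER BY ID, Date DESC").fetchall()

# Inspect how SQLite plans the query; the last column holds the description.
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(plan_row[-1])
```

Here ID 1 gets a cutoff at its third newest date, while ID 3 has no third row, so the fallback of 0 lets both of its rows through.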
Answer 1 (score: 2)
First, here is the correct query for the inequality approach:
SELECT p1.ID, p1.Date, p1.Value
FROM MyTable p1 LEFT JOIN
     MyTable AS p2
     ON p1.ID = p2.ID AND p2.Date >= p1.Date
-- p2 matches the rows of the same ID dated at or after p1, so COUNT(*) ranks p1 from the newest
GROUP BY p1.ID, p1.Date, p1.Value
-- grouping by every selected column ranks each row on its own
HAVING COUNT(*) <= 3
ORDER BY p1.ID, p1.Date DESC;
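In the inequality approach, the direction of the comparison decides which end of each group survives. Counting the same-ID rows dated at or after a row ranks it from the newest, and the reverse comparison ranks it from the oldest. The sketch below runs both directions with Python's sqlite3 module. It assumes an in-memory table holding the sample rows for IDs 1 and 2 and a threshold of 3; `keep_three` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
        (1, 1433160000, 55), (1, 1432900800, 24),
        (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
        (2, 1433160000, 22),
    ],
)


def keep_three(comparison):
    """Self-join ranking; comparison must be the literal '>=' or '<='."""
    if comparison not in (">=", "<="):
        raise ValueError("unsupported comparison")
    sql = f"""
        SELECT p1.ID, p1.Date, p1.Value
        FROM MyTable AS p1
        LEFT JOIN MyTable AS p2
          ON p1.ID = p2.ID AND p2.Date {comparison} p1.Date
        GROUP BY p1.ID, p1.Date, p1.Value
        HAVING COUNT(*) <= 3
        ORDER BY p1.ID, p1.Date DESC
    """
    return conn.execute(sql).fetchall()


newest = keep_three(">=")  # rank counted from the newest row of each ID
oldest = keep_three("<=")  # rank counted from the oldest row of each ID

print([date for id_, date, _ in newest if id_ == 1])
print([date for id_, date, _ in oldest if id_ == 1])
```

Because the join pairs every row with the other rows of its ID, the work grows roughly with the square of the rows per ID, which is one reason this approach tends to be slow on large groups.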
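I'm not sure there is a fast way to do this in SQLite. In most other databases you could use the ANSI-standard row_number() function. In MySQL you could use variables. Both are hard to do in SQLite. Your best solution may be to use a cursor.
The query above would benefit from an index on MyTable(Id, Date).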
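One way to read the cursor suggestion is to let the database do the sorting only and do the per-ID cut on the client. The following is a rough sketch of that idea in Python with the sqlite3 module, assuming an in-memory table with the question's sample rows. It is one possible interpretation, not necessarily what the answer had in mind.

```python
import sqlite3
from itertools import groupby, islice

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Date INTEGER, Value INTEGER, "
    "PRIMARY KEY (ID, Date))"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 1433419200, 15), (1, 1433332800, 23), (1, 1433246400, 41),
        (1, 1433160000, 55), (1, 1432900800, 24),
        (2, 1433419200, 52), (2, 1433332800, 23), (2, 1433246400, 39),
        (2, 1433160000, 22),
        (3, 1433419200, 11), (3, 1433246400, 58),
    ],
)

# A single ordered scan: rows arrive grouped by ID, newest first within each ID.
cursor = conn.execute(
    "SELECT ID, Date, Value FROM MyTable ORDER BY ID, Date DESC"
)

latest = []
for _id, group in groupby(cursor, key=lambda row: row[0]):
    # Keep the first three rows of each ID; groupby skips the rest of the group.
    latest.extend(islice(group, 3))

for row in latest:
    print(row)
```

This still reads every row once, so it trades query complexity for a full scan. Whether that is acceptable depends on the table size.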
Answer 2 (score: 0)
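-- T-SQL (SQL Server) syntax: CROSS APPLY and TOP are not available in SQLite
SELECT x.ID, x.Date, x.Value
FROM ( SELECT DISTINCT ID FROM XXXTable ) c
CROSS APPLY (
    -- the three highest row numbers are the three most recent dates for this ID
    SELECT TOP 3 A.ID, A.Date, A.Value, A.[Count]
    FROM (
        SELECT t.ID, t.Date, t.Value,
               ROW_NUMBER() OVER (
                   PARTITION BY t.ID
                   ORDER BY t.Date
               ) AS [Count]
        FROM XXXTable t
        WHERE t.ID = c.ID
    ) A
    ORDER BY A.[Count] DESC
) x;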