Question

I have a table in SQLite:

/* Create a table called EVENTS */
CREATE TABLE EVENTS(Id integer , Eventtype integer,value integer,Timestamp DATETIME);

/* Create a few records in this table */
INSERT INTO EVENTS VALUES(1,2,1,'2009-01-01 10:00:00');  --ROW1
INSERT INTO EVENTS VALUES(1,2,2,'2007-01-01 10:00:00');  --ROW2
INSERT INTO EVENTS VALUES(2,2,3,'2008-01-01 10:00:00');  --ROW3

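The query should return ROW1 and ROW3. When several rows share the same Id and Eventtype combination, it should pick the one with the most recent timestamp. ROW1 and ROW2 have the same eventtype and id, but ROW1 is the most recent, so it should be chosen.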

Answer 1

In SQLite 3.7.11 or later, you can use GROUP BY together with MAX() to choose which row of a group is returned:

SELECT *, MAX(timestamp)
FROM events
GROUP BY id, eventtype

In earlier versions, you have to use a subquery to look up a unique ID for the largest row in each group (as shown in your answer).
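A minimal sketch of the GROUP BY approach, run through Python's built-in sqlite3 module against the sample data from the question. The function name latest_per_group and the in-memory database are assumptions made for illustration. The column list is spelled out instead of using * only so that the timestamp does not appear twice in the output.

```python
import sqlite3

def latest_per_group():
    """Return the newest row for each (id, eventtype) pair."""
    conn = sqlite3.connect(":memory:")  # throwaway database for the demo
    conn.executescript("""
        CREATE TABLE events(id INTEGER, eventtype INTEGER,
                            value INTEGER, timestamp DATETIME);
        INSERT INTO events VALUES (1, 2, 1, '2009-01-01 10:00:00');
        INSERT INTO events VALUES (1, 2, 2, '2007-01-01 10:00:00');
        INSERT INTO events VALUES (2, 2, 3, '2008-01-01 10:00:00');
    """)
    # "value" is a bare column: with a single MAX() in the select list,
    # SQLite 3.7.11+ takes it from the row that holds the maximum.
    rows = conn.execute("""
        SELECT id, eventtype, value, MAX(timestamp)
        FROM events
        GROUP BY id, eventtype
        ORDER BY id, eventtype
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in latest_per_group():
        print(row)
```

On the sample data this returns ROW1 and ROW3. The ORDER BY is there only to make the output order deterministic.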

Answer 2

I got some help from the following link: sqlite equivalent of row_number() over ( partition by ...?

Here is what I came up with:

select * from events E1 where timestamp in
(select timestamp from events E2 where E2.id = E1.id and E2.eventtype=E1.eventtype
                         order by E2.timestamp desc
                         LIMIT 1  );

Also, for SQL Server, I am considering this solution (I could not test it):

select id,eventtype,value from
(select id,eventtype,value,ROW_NUMBER() over
 (PARTITION BY id,eventtype order by timestamp desc) AS RN from events) ranked
where RN<=1 ;
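SQLite itself added window functions in version 3.25.0, so on a recent build the ROW_NUMBER() form can be tried directly. The sketch below is one way to check that, assuming Python's built-in sqlite3 module; the names WINDOW_FUNCTIONS_AVAILABLE, latest_with_row_number and ranked are made up for this example. The alias RN cannot be filtered in the same SELECT that defines it, which is why the ranking sits in a derived table.

```python
import sqlite3

# Window functions need SQLite 3.25.0 or newer; older builds reject OVER (...).
WINDOW_FUNCTIONS_AVAILABLE = sqlite3.sqlite_version_info >= (3, 25, 0)

def latest_with_row_number():
    """Return the newest row per (id, eventtype) using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")  # throwaway database for the demo
    conn.executescript("""
        CREATE TABLE events(id INTEGER, eventtype INTEGER,
                            value INTEGER, timestamp DATETIME);
        INSERT INTO events VALUES (1, 2, 1, '2009-01-01 10:00:00');
        INSERT INTO events VALUES (1, 2, 2, '2007-01-01 10:00:00');
        INSERT INTO events VALUES (2, 2, 3, '2008-01-01 10:00:00');
    """)
    rows = conn.execute("""
        SELECT id, eventtype, value, timestamp
        FROM (
            SELECT id, eventtype, value, timestamp,
                   ROW_NUMBER() OVER (
                       PARTITION BY id, eventtype
                       ORDER BY timestamp DESC
                   ) AS rn
            FROM events
        ) AS ranked
        WHERE rn = 1
        ORDER BY id, eventtype
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    if WINDOW_FUNCTIONS_AVAILABLE:
        for row in latest_with_row_number():
            print(row)
    else:
        print("This SQLite build predates window functions.")
```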

Answer 3

I'm a bit late to this question, but I wasn't satisfied with the current answers, since most of them use this SO answer, which seriously hurts performance.

In many cases, a single-column analytic function can be simulated with a standard join:

SELECT e.*
FROM events e
JOIN
(
    -- Our simulated window with analytical result
    SELECT id, eventtype, MAX(timestamp) AS timestamp
    FROM events
    GROUP BY id, eventtype
) win
USING (id, eventtype, timestamp)

In general, the pattern is:

SELECT main.*
FROM main
JOIN
(
    SELECT
        partition_columns,
        FUNCTION(analyzed_column) AS analyzed_column
    FROM main
    GROUP BY partition_columns
) win
USING (partition_columns, analyzed_column)

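These simulated windows are not perfect:

If your data has ties on the analyzed column result, you may need to remove duplicates from the result set. Otherwise, you will select every row in the partition that matches the analyzed column result.
If the analytic function needs to order by more than one column, you will need a correlated subquery instead. The other answers can be adapted to achieve the desired effect.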
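The tie caveat can be seen with a small experiment. This is a sketch assuming Python's built-in sqlite3 module; SAMPLE mirrors the rows from the question, while TIED_ROW and the function name join_latest are hypothetical additions used only to provoke a tie.

```python
import sqlite3

# Rows from the question.
SAMPLE = [
    (1, 2, 1, "2009-01-01 10:00:00"),
    (1, 2, 2, "2007-01-01 10:00:00"),
    (2, 2, 3, "2008-01-01 10:00:00"),
]
# Hypothetical extra row sharing ROW1's id, eventtype and timestamp.
TIED_ROW = (1, 2, 9, "2009-01-01 10:00:00")

def join_latest(rows):
    """Run the simulated-window join over the given rows."""
    conn = sqlite3.connect(":memory:")  # throwaway database for the demo
    conn.execute(
        "CREATE TABLE events(id INTEGER, eventtype INTEGER, "
        "value INTEGER, timestamp DATETIME)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    result = conn.execute("""
        SELECT e.*
        FROM events e
        JOIN (
            SELECT id, eventtype, MAX(timestamp) AS timestamp
            FROM events
            GROUP BY id, eventtype
        ) win
        USING (id, eventtype, timestamp)
        ORDER BY e.id, e.value
    """).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    print(join_latest(SAMPLE))               # one row per group
    print(join_latest(SAMPLE + [TIED_ROW]))  # the tied group yields two rows
```

Without ties the join yields one row per partition. Once two rows share the maximum timestamp, both come back, so a further rule (for example, keeping the larger value) would be needed to pick one.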

SQLite equivalent of ROW_NUMBER
