SQLite: Find the top n aggregates for each distinct value in a column

Time: 2014-01-10 17:55:56

Tags: sql sqlite aggregate-functions

That's not a very clear title, is it?

I have an SQLite table results

event   | dayOfWeek | hour| eventCount
--------+-----------+-----+------------
Event A | 0         | 0   | 4926
Event A | 0         | 1   | 1492
...
Event A | 1         | 0   | 7372
Event A | 1         | 1   | 49
...
Event B | 0         | 0   | 234648
...

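which just contains the number of times each event happened in each hour of each day of the week.

I've been building a table daily like this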

create table daily as
select  event,
        sum(case when dayOfWeek = 0 then eventCount else 0 end) as sunday,
        sum(case when dayOfWeek = 1 then eventCount else 0 end) as monday,
        sum(case when dayOfWeek = 2 then eventCount else 0 end) as tuesday,
        sum(case when dayOfWeek = 3 then eventCount else 0 end) as wednesday,
        sum(case when dayOfWeek = 4 then eventCount else 0 end) as thursday,
        sum(case when dayOfWeek = 5 then eventCount else 0 end) as friday,
        sum(case when dayOfWeek = 6 then eventCount else 0 end) as saturday
from results
group by event;

to get a table that looks like this:

event   |sunday|monday|tuesday|wednesday|thursday|friday|saturday
--------+------+------+-------+---------+--------+------+---------
Event A | 345  | 2345 | 341   | 568     | 689    | 2351 | 1455
...

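which just contains the count for each event type on each day of the week. Building a similar table for hour of day and for day/hour is trivial, and I have both of those tables.

I'd like to make a table topTenPerHour like this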
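For reference, here is a minimal sketch of that conditional-aggregation pivot, run through Python's built-in sqlite3 module against an in-memory database. The sample rows are made up for illustration, and only two weekday columns are shown to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (event TEXT, dayOfWeek INTEGER, hour INTEGER, eventCount INTEGER)"
)
# Made-up sample rows, only for illustration.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("Event A", 0, 0, 10),
        ("Event A", 0, 1, 5),
        ("Event A", 1, 0, 7),
        ("Event B", 0, 0, 3),
        ("Event B", 1, 1, 4),
    ],
)

# Conditional aggregation: each CASE keeps only the rows for one weekday,
# and SUM adds up their eventCount values per event.
daily = conn.execute(
    """
    SELECT event,
           SUM(CASE WHEN dayOfWeek = 0 THEN eventCount ELSE 0 END) AS sunday,
           SUM(CASE WHEN dayOfWeek = 1 THEN eventCount ELSE 0 END) AS monday
    FROM results
    GROUP BY event
    ORDER BY event
    """
).fetchall()

for row in daily:
    print(row)
```

The same pattern works for hours by switching the CASE condition from dayOfWeek to hour.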

hour | 1st     | 2nd     | 3rd     | ...
-----+---------+---------+---------+------
0    | Event A | Event C | Event B | ...
1    | Event B | Event D | Event C | ...
...
23   | Event A | Event R | Event D | ...

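but I'm having a hard time seeing how to do it. Any suggestions?

EDIT: I don't actually need to create the table (I only need to make the SELECT call), so SQLite's restrictions on CREATE TABLE (such as the unavailability of JOIN) don't apply to this problem.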

2 answers:

Answer 0: (score: 0)

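You have set up your database in an overly complicated way here.

You should have:

  • An EventType table that defines each event
  • An EventLog table that records each individual event with an "EventTypeId" foreign key and a timestamp.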

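Then you can use database functions in your queries to do everything else. Trying to store all of this information in tables is redundant, because it already exists in other tables. It should be the job of the program accessing the database to call the right queries, not the job of the database to hold redundant information.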
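A minimal sketch of what that could look like, using Python's sqlite3 module. Only the EventType and EventLog table names and the EventTypeId key come from the suggestion above; the other column names (name, EventLogId, occurredAt) and the sample rows are assumptions made for illustration. The query derives the same event / dayOfWeek / hour / eventCount shape as the results table from raw timestamps, using SQLite's strftime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EventType (
        EventTypeId INTEGER PRIMARY KEY,
        name        TEXT NOT NULL          -- hypothetical column name
    );
    CREATE TABLE EventLog (
        EventLogId  INTEGER PRIMARY KEY,   -- hypothetical column name
        EventTypeId INTEGER NOT NULL REFERENCES EventType(EventTypeId),
        occurredAt  TEXT NOT NULL          -- hypothetical; 'YYYY-MM-DD HH:MM:SS'
    );
    """
)
conn.executemany(
    "INSERT INTO EventType VALUES (?, ?)", [(1, "Event A"), (2, "Event B")]
)
# Made-up log entries, only for illustration.
conn.executemany(
    "INSERT INTO EventLog (EventTypeId, occurredAt) VALUES (?, ?)",
    [
        (1, "2014-01-05 00:15:00"),  # a Sunday
        (1, "2014-01-05 00:45:00"),
        (1, "2014-01-06 01:30:00"),  # a Monday
        (2, "2014-01-05 00:05:00"),
    ],
)

# strftime('%w') yields the day of week (0 = Sunday) and '%H' the hour,
# so the per-day, per-hour counts can be computed on demand.
derived = conn.execute(
    """
    SELECT t.name AS event,
           CAST(strftime('%w', l.occurredAt) AS INTEGER) AS dayOfWeek,
           CAST(strftime('%H', l.occurredAt) AS INTEGER) AS hour,
           COUNT(*) AS eventCount
    FROM EventLog AS l
    JOIN EventType AS t ON t.EventTypeId = l.EventTypeId
    GROUP BY t.name, dayOfWeek, hour
    ORDER BY event, dayOfWeek, hour
    """
).fetchall()

for row in derived:
    print(row)
```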

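You would generally only set things up the way you have now if you repeatedly run the same queries on static data (data that is rarely updated). In that case, you would be doing it purely to optimize query runtime.

Answer 1: (score: 0)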

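As you've seen in your other queries, SQL isn't really designed for having multiple columns with similar meaning; you end up duplicating a lot of code.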

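To get the top n values, we need to compute their rank, which in this case is the number of records in the same hour that have no smaller event count: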

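-- results has one row per event, day and hour, so first total each
-- event's count per hour across all days of the week
CREATE VIEW /* or TABLE */ hourly AS
SELECT event,
       hour,
       SUM(eventCount) AS eventCount
FROM results
GROUP BY event, hour;

CREATE VIEW /* or TABLE */ ranks AS
SELECT hour,
       event,
       (SELECT COUNT(*)
        FROM hourly
        WHERE hour = hours.hour
          AND eventCount >= hours.eventCount
       ) AS rank
FROM hourly AS hours;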

Then, for each column, we get the value from the record that has the corresponding rank:

SELECT hour,
       (SELECT event FROM ranks WHERE hour = h.hour AND rank = 1) AS "1st",
       (SELECT event FROM ranks WHERE hour = h.hour AND rank = 2) AS "2nd",
       ...
FROM (SELECT DISTINCT hour
      FROM results) AS h
ORDER BY hour
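
A minimal end-to-end sketch of the two steps above, run through Python's sqlite3 module on an in-memory database. The sample rows are made up, and the hourly view is an assumption added here: it sums eventCount across days of the week before ranking, since results holds one row per event, day and hour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (event TEXT, dayOfWeek INTEGER, hour INTEGER, eventCount INTEGER)"
)
# Made-up sample rows, only for illustration.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("Event A", 0, 0, 50), ("Event A", 1, 0, 40),  # hour 0 total: 90
        ("Event B", 0, 0, 30), ("Event B", 1, 0, 20),  # hour 0 total: 50
        ("Event C", 0, 0, 60),                         # hour 0 total: 60
        ("Event A", 0, 1, 5),                          # hour 1 total: 5
        ("Event B", 0, 1, 9),                          # hour 1 total: 9
    ],
)

conn.executescript(
    """
    -- Assumed helper view: total per event and hour across all days.
    CREATE VIEW hourly AS
    SELECT event, hour, SUM(eventCount) AS eventCount
    FROM results
    GROUP BY event, hour;

    -- Rank = number of events in the same hour with a total at least as large.
    CREATE VIEW ranks AS
    SELECT hour,
           event,
           (SELECT COUNT(*)
            FROM hourly
            WHERE hour = hours.hour
              AND eventCount >= hours.eventCount
           ) AS rank
    FROM hourly AS hours;
    """
)

# One scalar subquery per output column picks the event with that rank.
top = conn.execute(
    """
    SELECT hour,
           (SELECT event FROM ranks WHERE hour = h.hour AND rank = 1) AS "1st",
           (SELECT event FROM ranks WHERE hour = h.hour AND rank = 2) AS "2nd",
           (SELECT event FROM ranks WHERE hour = h.hour AND rank = 3) AS "3rd"
    FROM (SELECT DISTINCT hour FROM results) AS h
    ORDER BY hour
    """
).fetchall()

for row in top:
    print(row)
```

Note that with this counting definition, two events tied within an hour both receive the larger rank, so the column for the smaller rank comes back as NULL for that hour. Hours with fewer events than columns also yield NULL in the trailing columns.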