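Group by half-hour interval

Date: 2014-03-09 14:12:50

Tags: mysql sql

I was lucky enough to find this awesome piece of code on Stack Overflow, but I want to change it so that it shows every half hour instead of every hour. Messing around with it only makes me break the query, haha.

This is the SQL: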

SELECT CONCAT(HOUR(created_at), ':00-', HOUR(created_at)+1, ':00') as hours,
       COUNT(*)
FROM urls
GROUP BY HOUR(created_at)
ORDER BY HOUR(created_at) ASC

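How would I get a result for every half hour? :)

Another thing: if a half hour has no results, I would like it to return 0 instead of just skipping that step. It looks kind of weird when I build statistics from the query and it simply skips an hour because there weren't any rows :P

6 answers:

Answer 0 (score: 5)

If the format isn't too important, you can return two columns for the interval. You might even need only the start of the interval, which can be determined with: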

date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start

The alias can be used in the GROUP BY and ORDER BY clauses. If you also need the end of the interval, a small modification is required:

SELECT
  date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start,
  date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i') as period_end,
  COUNT(*)
FROM urls
GROUP BY period_start
ORDER BY period_start ASC;
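To sanity-check the rounding outside the database, the same arithmetic can be sketched in Python. This is only an illustration and not part of the query; the function name `period_bounds` is made up for the example.

```python
from datetime import datetime, timedelta

def period_bounds(created_at):
    """Return (period_start, period_end) as 'HH:MM' strings.

    Mirrors the SQL above: subtract minute % 30 to get the start,
    add 30 - minute % 30 to get the end.
    """
    offset = created_at.minute % 30
    start = created_at - timedelta(minutes=offset)
    end = created_at + timedelta(minutes=30 - offset)
    return start.strftime("%H:%M"), end.strftime("%H:%M")

print(period_bounds(datetime(2014, 3, 9, 14, 12, 50)))
```

Seconds are not removed by the subtraction, but the format string drops them, just as '%H:%i' does in the query.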

Of course, you can also concatenate the values:

SELECT concat_ws('-',
           date_format(created_at - interval minute(created_at)%30 minute, '%H:%i'),
           date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i')
       ) as period,
       COUNT(*)
FROM urls
GROUP BY period
ORDER BY period ASC;

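Demo: http://rextester.com/RPN50688

"Another thing: if a half hour has no results, I would like it to return 0"

If you consume the result in a procedural language, you can initialize all 48 rows in a loop and then "inject" the non-zero rows from the result.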
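A minimal sketch of that idea in Python, assuming the query returns rows of (period_start, count) with period_start formatted as 'HH:MM'. The function name `fill_half_hours` and the row shape are assumptions made for this example.

```python
def fill_half_hours(rows):
    """Return a dict with all 48 half-hour slots, zero where no row exists.

    rows: iterable of (period_start, count) pairs, period_start as 'HH:MM'.
    """
    # Initialize all 48 slots with 0; dicts keep insertion order.
    slots = {}
    for i in range(48):
        slots[f"{i // 2:02d}:{30 * (i % 2):02d}"] = 0

    # "Inject" the non-zero rows coming from the database.
    for period_start, cnt in rows:
        if period_start not in slots:
            raise ValueError(f"unexpected period start: {period_start!r}")
        slots[period_start] = cnt
    return slots

filled = fill_half_hours([("00:00", 1), ("23:30", 3)])
print(len(filled), filled["00:30"])
```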

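However, if you need to do it in SQL, you need a table with at least 48 rows to LEFT JOIN against. This could be done inline with a "huge" UNION ALL statement, but (IMHO) that would be ugly. So I prefer to have a sequence table with a single integer column, which can be very useful for reports. To create that table I usually use information_schema.COLUMNS, since it is available on any MySQL server and has at least a couple of hundred rows. If you need more rows, just join it with itself.

Now let's create that table: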

drop table if exists helper_seq;
create table helper_seq (seq smallint auto_increment primary key)
    select null
    from information_schema.COLUMNS c1
       , information_schema.COLUMNS c2
    limit 100; -- adjust as needed

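Now we have a table with integers from 1 to 100 (you only need 48 right now, but this is for demonstration).

Using that table we can now create all 48 time intervals: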

select time(0) + interval 30*(seq-1) minute as period_start,
       time(0) + interval 30*(seq)   minute as period_end
from helper_seq s
where s.seq <= 48;

We will get the following result:

period_start | period_end
    00:00:00 |   00:30:00
    00:30:00 |   01:00:00
...
   23:30:00  |   24:00:00

Demo: http://rextester.com/ISQSU31450

Now we can use it as a derived table (a subquery in the FROM clause) and LEFT JOIN the urls table:

select p.period_start, p.period_end, count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
    on  time(u.created_at) >= p.period_start
    and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start

Demo: http://rextester.com/IQYQ32927

The last step (if really needed) is to format the result. We can use CONCAT or CONCAT_WS together with TIME_FORMAT in the outer select. The final query is:

select concat_ws('-',
         time_format(p.period_start, '%H:%i'),
         time_format(p.period_end,   '%H:%i')
       ) as period,
       count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
    on  time(u.created_at) >= p.period_start
    and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start

The result looks like this:

period      | cnt
00:00-00:30 |   1
00:30-01:00 |   0
...
23:30-24:00 |   3

Demo: http://rextester.com/LLZ41445

Answer 1 (score: 2)

Well, this might be a bit verbose, but it works:

SELECT hours, SUM(count) as count FROM (
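    -- The hour is zero-padded as well, so that e.g. 9:00 matches the '09:00-09:30' filler label below
    SELECT CONCAT(LPAD(HOUR(created_at), 2, '0'), ':', LPAD(30 * FLOOR(MINUTE(created_at)/30), 2, '0'), '-',
                  LPAD(HOUR(DATE_ADD(created_at, INTERVAL 30 minute)), 2, '0'), ':', LPAD(30 * FLOOR(MINUTE(DATE_ADD(created_at, INTERVAL 30 minute))/30), 2, '0')) as hours,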
           COUNT(*) as count
    FROM urls
    GROUP BY HOUR(created_at), FLOOR(MINUTE(created_at)/30)

    UNION ALL

    SELECT '00:00-00:30'as hours, 0 as count UNION ALL SELECT '00:30-01:00'as hours, 0 as count UNION ALL 
    SELECT '01:00-01:30'as hours, 0 as count UNION ALL SELECT '01:30-02:00'as hours, 0 as count UNION ALL 
    SELECT '02:00-02:30'as hours, 0 as count UNION ALL SELECT '02:30-03:00'as hours, 0 as count UNION ALL 
    SELECT '03:00-03:30'as hours, 0 as count UNION ALL SELECT '03:30-04:00'as hours, 0 as count UNION ALL 
    SELECT '04:00-04:30'as hours, 0 as count UNION ALL SELECT '04:30-05:00'as hours, 0 as count UNION ALL 
    SELECT '05:00-05:30'as hours, 0 as count UNION ALL SELECT '05:30-06:00'as hours, 0 as count UNION ALL 
    SELECT '06:00-06:30'as hours, 0 as count UNION ALL SELECT '06:30-07:00'as hours, 0 as count UNION ALL 
    SELECT '07:00-07:30'as hours, 0 as count UNION ALL SELECT '07:30-08:00'as hours, 0 as count UNION ALL 
    SELECT '08:00-08:30'as hours, 0 as count UNION ALL SELECT '08:30-09:00'as hours, 0 as count UNION ALL 
    SELECT '09:00-09:30'as hours, 0 as count UNION ALL SELECT '09:30-10:00'as hours, 0 as count UNION ALL 
    SELECT '10:00-10:30'as hours, 0 as count UNION ALL SELECT '10:30-11:00'as hours, 0 as count UNION ALL 
    SELECT '11:00-11:30'as hours, 0 as count UNION ALL SELECT '11:30-12:00'as hours, 0 as count UNION ALL 
    SELECT '12:00-12:30'as hours, 0 as count UNION ALL SELECT '12:30-13:00'as hours, 0 as count UNION ALL 
    SELECT '13:00-13:30'as hours, 0 as count UNION ALL SELECT '13:30-14:00'as hours, 0 as count UNION ALL 
    SELECT '14:00-14:30'as hours, 0 as count UNION ALL SELECT '14:30-15:00'as hours, 0 as count UNION ALL 
    SELECT '15:00-15:30'as hours, 0 as count UNION ALL SELECT '15:30-16:00'as hours, 0 as count UNION ALL 
    SELECT '16:00-16:30'as hours, 0 as count UNION ALL SELECT '16:30-17:00'as hours, 0 as count UNION ALL 
    SELECT '17:00-17:30'as hours, 0 as count UNION ALL SELECT '17:30-18:00'as hours, 0 as count UNION ALL 
    SELECT '18:00-18:30'as hours, 0 as count UNION ALL SELECT '18:30-19:00'as hours, 0 as count UNION ALL 
    SELECT '19:00-19:30'as hours, 0 as count UNION ALL SELECT '19:30-20:00'as hours, 0 as count UNION ALL 
    SELECT '20:00-20:30'as hours, 0 as count UNION ALL SELECT '20:30-21:00'as hours, 0 as count UNION ALL 
    SELECT '21:00-21:30'as hours, 0 as count UNION ALL SELECT '21:30-22:00'as hours, 0 as count UNION ALL 
    SELECT '22:00-22:30'as hours, 0 as count UNION ALL SELECT '22:30-23:00'as hours, 0 as count UNION ALL 
    SELECT '23:00-23:30'as hours, 0 as count UNION ALL SELECT '23:30-00:00'as hours, 0 as count 

) AS T
GROUP BY hours ORDER BY hours;

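The hardest part of the query is outputting statistics for intervals that have no hits. SQL is all about querying and aggregating existing data; selecting or aggregating data that is missing from the table is a rather unusual task. That is why, as Wolph said in the comments, there is no perfect solution for this task.

I solved it by explicitly selecting all the half-hour intervals of the day. This solution can be used if the number of intervals is limited. It will not work, however, if you aggregate over different dates across a long period of time.

I'm not a fan of this solution, but I can't come up with a better one. A more elegant solution could be achieved with a looping stored procedure, but it seems you want to solve it with a raw SQL query.

Answer 2 (score: 1)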

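  1. Switch to seconds.
  2. Do arithmetic to get a number for each unit of time (30*60 for a half hour, in your case).
  3. Have a table of consecutive numbers.
  4. Use LEFT JOIN to get even the missing time units.
  5. Do the GROUP BY.
  6. Convert back from time units to real time, for display.

(Steps 3 and 4 are optional. The question says "every", so I assume they are needed.)

Steps 1 and 2 are embodied in something like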
    FLOOR(UNIX_TIMESTAMP(created_at) / (30*60))
    

For example:

    mysql> SELECT NOW(), FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60));
    +---------------------+----------------------------------------+
    | NOW()               | FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60)) |
    +---------------------+----------------------------------------+
    | 2018-03-02 08:24:48 |                                 844448 |
    +---------------------+----------------------------------------+
    

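Step 3 needs to be done once and kept in a permanent table. Or, if you have MariaDB, use a "seq" pseudo-table; for example, `seq_844448_to_900000` dynamically provides a table that reaches far into the future.

Example for step 6: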

    mysql> SELECT DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i");
    +-------------------------------------------------------------+
    | DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i") |
    +-------------------------------------------------------------+
    | Mar 02 08:00                                                |
    +-------------------------------------------------------------+
    mysql> SELECT DATE_FORMAT(FROM_UNIXTIME((844448+1) * 30*60), "%b %d %h:%i");
    +---------------------------------------------------------------+
    | DATE_FORMAT(FROM_UNIXTIME((844448+1) * 30*60), "%b %d %h:%i") |
    +---------------------------------------------------------------+
    | Mar 02 08:30                                                  |
    +---------------------------------------------------------------+
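The six steps can also be sketched end to end outside the database. The Python below is only an illustration of the arithmetic, not the SQL itself; the name `bucket_counts` and the choice of UTC for display are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 30 * 60  # half an hour

def bucket_counts(timestamps):
    """Count timezone-aware datetimes per half-hour bucket, gaps included."""
    if not timestamps:
        return []
    # Steps 1 and 2: switch to seconds, then divide to get a bucket number.
    ids = [int(ts.timestamp() // BUCKET_SECONDS) for ts in timestamps]
    # Step 5: the GROUP BY equivalent.
    counts = Counter(ids)
    result = []
    # Steps 3 and 4: walk consecutive bucket numbers so missing ones show up as 0.
    for bucket in range(min(ids), max(ids) + 1):
        # Step 6: convert the bucket number back to a real time for display.
        start = datetime.fromtimestamp(bucket * BUCKET_SECONDS, tz=timezone.utc)
        result.append((start, counts.get(bucket, 0)))
    return result

sample = [
    datetime(2018, 3, 2, 8, 24, 48, tzinfo=timezone.utc),
    datetime(2018, 3, 2, 8, 5, 0, tzinfo=timezone.utc),
    datetime(2018, 3, 2, 9, 40, 0, tzinfo=timezone.utc),
]
for start, cnt in bucket_counts(sample):
    print(start.strftime("%b %d %H:%M"), cnt)
```

Unlike the 48-slot approaches, this keeps the date in the bucket number, so it also works when the data spans several days.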
    

Answer 3 (score: 0)

You can add some math to compute 48 intervals instead of 24, and put the result in another column that you group and order by.

SELECT HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) as interval48, 
    -- The parentheses matter: % binds tighter than +, so without them
    -- the test is only true for the very first interval.
    if((HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30)) % 2 = 0,
    CONCAT(HOUR(created_at), ':00-', HOUR(created_at), ':30'),
    CONCAT(HOUR(created_at), ':30-', HOUR(created_at)+1, ':00')
       )  as hours,
      count(*)
FROM urls
GROUP BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30)
ORDER BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) ASC

Example result:

0   0:00-0:30   2017
1   0:30-1:00   1959
2   1:00-1:30   1830
3   1:30-2:00   1715
4   2:00-2:30   1679
5   2:30-3:00   1688
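The interval arithmetic is easy to check outside SQL. A small Python sketch, where the function names `slot_index` and `slot_label` are made up for this example and the label format follows the CONCAT calls in the query:

```python
def slot_index(hour, minute):
    """Half-hour slot number 0..47, same as HOUR*2 + FLOOR(MINUTE/30)."""
    return hour * 2 + minute // 30

def slot_label(index):
    """Label for a slot: even slots start on the hour, odd slots at :30."""
    hour, half = divmod(index, 2)
    if half == 0:
        return f"{hour}:00-{hour}:30"
    return f"{hour}:30-{hour + 1}:00"

for i in range(6):
    print(i, slot_label(i))
```

Even slot numbers must map to the first half of the hour, which is why the whole sum has to be taken modulo 2 rather than just the FLOOR term.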

The result of the original query posted by Jazerix is:

0:00-1:00 3976
1:00-2:00 3545
2:00-3:00 3367

Answer 4 (score: 0)

Another approach, without creating any extra tables. It may look like a hack :-)

Step 1: Generate the time table on the fly

Assumption: the INFORMATION_SCHEMA database is available and has a COLLATIONS table, which usually holds more than 100 records. Any table with at least 48 records will do.

Query:

SELECT @time fromTime, ADDTIME(@time, '00:29:00')  toTime,  
@time := ADDTIME(@time, '00:30:00') 
FROM information_schema.COLLATIONS
JOIN (SELECT @time := TIME('00:00:00')) a
WHERE @time < '24:00:00'

The above query gives a table of from and to times at 30-minute intervals.

Step 2: Join the urls table with the result of the first query to produce the desired output

Query:

SQLFiddle

Answer 5 (score: 0)

I hope this works for you:

SELECT 
-- >= 30 (not > 30) so that minute 30 itself falls into the second half hour
@sTime:= CONCAT(HOUR(created_at),":",
    (CASE WHEN MINUTE(created_at) >= 30 THEN 30 ELSE 0 END)) as intVar,
(CONCAT(
    AddTime(@sTime, '00:00:00'),
    ' to ',
    AddTime(@sTime, '00:30:00')
)) as timeInterval, 
COUNT(*) FROM urls 
GROUP BY 
(CONCAT(HOUR(created_at),":",(CASE WHEN MINUTE(created_at) >= 30 THEN 30 ELSE 0 END))) 
ORDER BY HOUR(created_at) ASC