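I have a list of times in a database column (each one represents a visit to a website).
I need to group them into intervals and then get a "cumulative frequency" table of those times.
For example, I might have: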
9:01
9:04
9:11
9:13
9:22
9:24
9:28
I want to convert this into something like:
9:05 - 2
9:15 - 4
9:25 - 6
9:30 - 7
How would I do this? Can I even do this easily in SQL? I could do it quite easily in C#.
Answer 0 (score: 8)
create table accu_times (time_val datetime not null, constraint pk_accu_times primary key (time_val));
go
insert into accu_times values ('9:01');
insert into accu_times values ('9:05');
insert into accu_times values ('9:11');
insert into accu_times values ('9:13');
insert into accu_times values ('9:22');
insert into accu_times values ('9:24');
insert into accu_times values ('9:28');
go
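select rounded_time,
    (
        -- cumulative count: every visit at or before this bucket boundary
        select count(*)
        from accu_times as at2
        where at2.time_val <= rt.rounded_time
    ) as accu_count
from (
    -- distinct bucket boundaries, built from each visit's hour and rounded minute
    select distinct
        dateadd(minute, round((datepart(minute, at.time_val) + 2)*2, -1)/2,
            dateadd(hour, datepart(hour, at.time_val), 0)
        ) as rounded_time
    from accu_times as at
) as rt
order by rounded_time
go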
drop table accu_times
Result:
rounded_time            accu_count
----------------------- -----------
1900-01-01 09:05:00.000 2
1900-01-01 09:15:00.000 4
1900-01-01 09:25:00.000 6
1900-01-01 09:30:00.000 7
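The rounding expression in the query above is terse, so here is a minimal sketch (T-SQL, relying on integer arithmetic as in SQL Server) that applies only the minute arithmetic to the sample minutes from the question. The derived table sample_minutes and its column m are hypothetical names introduced just for this illustration.

```sql
-- Apply the same minute-rounding expression to the question's sample minutes.
select m as minute_of_hour,
       round((m + 2) * 2, -1) / 2 as rounded_minute
from (
    select 1 as m union all
    select 4 union all
    select 11 union all
    select 13 union all
    select 22 union all
    select 24 union all
    select 28
) as sample_minutes
order by m;
```

Each minute is shifted by 2 and rounded to the nearest multiple of 5, which gives 5, 5, 15, 15, 25, 25 and 30 for the sample data. A minute such as 6 would map to 10, so the expression produces regular 5-minute boundaries rather than the exact irregular buckets listed in the question.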
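Answer 1 (score: 3)
I should point out that, based on the stated "intent" of the question, which is to analyze visitor traffic, I wrote this statement to summarize the counts in uniform groups.
Otherwise (as in the "example" groups) you would be comparing the counts in a 5-minute interval with the counts in a 10-minute interval, which doesn't make sense.
You have to understand the "intent" of the user's requirement, not the literal "reading" of it. :-)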
create table #myDates
(
myDate datetime
);
go
insert into #myDates values ('10/02/2008 09:01:23');
insert into #myDates values ('10/02/2008 09:03:23');
insert into #myDates values ('10/02/2008 09:05:23');
insert into #myDates values ('10/02/2008 09:07:23');
insert into #myDates values ('10/02/2008 09:11:23');
insert into #myDates values ('10/02/2008 09:14:23');
insert into #myDates values ('10/02/2008 09:19:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:26:23');
insert into #myDates values ('10/02/2008 09:27:23');
insert into #myDates values ('10/02/2008 09:29:23');
go
declare @interval int;
set @interval = 10;
select
    convert(varchar(5), dateadd(minute, @interval - datepart(minute, myDate) % @interval, myDate), 108) timeGroup,
    count(*)
from
    #myDates
group by
    convert(varchar(5), dateadd(minute, @interval - datepart(minute, myDate) % @interval, myDate), 108)
returns:
timeGroup
--------- -----------
09:10     4
09:20     3
09:30     8
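Since the question asks for a cumulative frequency, the per-interval counts above can be turned into a running total. The following is a minimal sketch for SQL Server 2005 or later that reuses the #myDates table and the same bucketing expression. The CTE name buckets and the column names visits and cumulativeVisits are assumptions made for this example. It compares the HH:MM strings, so it only holds when all rows fall on the same day.

```sql
declare @interval int;
set @interval = 10;

-- Step 1: per-interval counts, using the same bucketing expression as above.
with buckets as (
    select
        convert(varchar(5), dateadd(minute, @interval - datepart(minute, myDate) % @interval, myDate), 108) as timeGroup,
        count(*) as visits
    from #myDates
    group by
        convert(varchar(5), dateadd(minute, @interval - datepart(minute, myDate) % @interval, myDate), 108)
)
-- Step 2: for each bucket, add up the counts of that bucket and all earlier ones.
select
    b.timeGroup,
    (
        select sum(b2.visits)
        from buckets as b2
        where b2.timeGroup <= b.timeGroup
    ) as cumulativeVisits
from buckets as b
order by b.timeGroup;
```

With the sample rows inserted above, the running totals are 4, 7 and 15 for 09:10, 09:20 and 09:30.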
Answer 2 (score: 2)
Normalize to seconds, divide by your bucket interval, truncate, and remultiply:
select sec_to_time(floor(time_to_sec(d)/300)*300) as i, count(*)
from d
group by sec_to_time(floor(time_to_sec(d)/300)*300)
Using Ron Savage's data, I get
+----------+----------+
| i | count(*) |
+----------+----------+
| 09:00:00 | 1 |
| 09:05:00 | 3 |
| 09:10:00 | 1 |
| 09:15:00 | 1 |
| 09:20:00 | 6 |
| 09:25:00 | 2 |
| 09:30:00 | 1 |
+----------+----------+
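You may want to use ceil() or round() instead of floor(). (The counts shown above appear to correspond to round(): with floor(), 09:01:23 and 09:03:23 would both fall into the 09:00:00 bucket.)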
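To make the choice between the three functions concrete, here is a small MySQL sketch that buckets a single sample time, 09:03:23, into 300-second intervals each way. The column aliases are names chosen for this illustration only.

```sql
-- 09:03:23 is 32603 seconds after midnight; 32603 / 300 = 108.68
select
    time_to_sec('09:03:23')                                  as secs,          -- 32603
    sec_to_time(floor(time_to_sec('09:03:23') / 300) * 300)  as floor_bucket,  -- 09:00:00
    sec_to_time(ceil(time_to_sec('09:03:23') / 300) * 300)   as ceil_bucket,   -- 09:05:00
    sec_to_time(round(time_to_sec('09:03:23') / 300) * 300)  as round_bucket;  -- 09:05:00
```

floor() labels a visit with the start of its interval, ceil() with the end of it, and round() with whichever boundary is nearest.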
Update: this is for a table created with
create table d (
    d datetime
);
Answer 3 (score: 1)
Create a table, periods, describing the periods you want to divide the day into.
SELECT periods.name, count(time)
FROM periods, times
WHERE periods.start <= times.time
AND times.time < periods.end
GROUP BY periods.name
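The answer does not show what the periods table looks like, so the following is only a sketch of one possible layout. The column names period_start and period_end are assumptions (chosen because END is a reserved word in several dialects), the three rows are hypothetical sample data, and the times table with its time column is taken from the query above.

```sql
-- Hypothetical layout for the periods table.
create table periods (
    name         varchar(20) not null primary key,
    period_start datetime    not null,  -- inclusive lower bound
    period_end   datetime    not null   -- exclusive upper bound
);

-- Hypothetical sample rows: three contiguous 10-minute periods.
insert into periods values ('09:00-09:10', '2008-10-02T09:00:00', '2008-10-02T09:10:00');
insert into periods values ('09:10-09:20', '2008-10-02T09:10:00', '2008-10-02T09:20:00');
insert into periods values ('09:20-09:30', '2008-10-02T09:20:00', '2008-10-02T09:30:00');

-- Same idea as the query above, written as an explicit join.
select p.name, count(t.time) as visits
from periods as p
left join times as t
    on p.period_start <= t.time
   and t.time < p.period_end
group by p.name;
```

The left join is a design choice: it keeps periods that contain no visits and reports them with a count of 0, whereas the comma join in the answer drops them.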
Answer 4 (score: 1)
Create a table containing the intervals you want totals for, then join the two tables together.
For example:
time_entry.time_entry
-----------------------
2008-10-02 09:01:00.000
2008-10-02 09:04:00.000
2008-10-02 09:11:00.000
2008-10-02 09:13:00.000
2008-10-02 09:22:00.000
2008-10-02 09:24:00.000
2008-10-02 09:28:00.000
time_interval.time_end
-----------------------
2008-10-02 09:05:00.000
2008-10-02 09:15:00.000
2008-10-02 09:25:00.000
2008-10-02 09:30:00.000
SELECT
    ti.time_end,
    COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
    ON te.time_entry < ti.time_end
GROUP BY ti.time_end;
time_end                interval_total
----------------------- -------------
2008-10-02 09:05:00.000 2
2008-10-02 09:15:00.000 4
2008-10-02 09:25:00.000 6
2008-10-02 09:30:00.000 7
If you don't want cumulative totals but totals within each range instead, add a time_start column to the time_interval table and change the query to
SELECT
    ti.time_end,
    COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
    ON te.time_entry >= ti.time_start
    AND te.time_entry < ti.time_end
GROUP BY ti.time_end;
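To try both queries, the two tables need to exist first. Here is a minimal setup sketch. The table and column names come from the answer, but the column types and the time_start values (contiguous intervals starting at 09:00) are assumptions made for this example.

```sql
-- Assumed schema; only the table and column names come from the answer.
create table time_entry (
    time_entry datetime not null
);

create table time_interval (
    time_start datetime not null,  -- inclusive lower bound (per-range query only)
    time_end   datetime not null   -- exclusive upper bound (both queries)
);

-- Visit times from the question.
insert into time_entry values ('2008-10-02T09:01:00');
insert into time_entry values ('2008-10-02T09:04:00');
insert into time_entry values ('2008-10-02T09:11:00');
insert into time_entry values ('2008-10-02T09:13:00');
insert into time_entry values ('2008-10-02T09:22:00');
insert into time_entry values ('2008-10-02T09:24:00');
insert into time_entry values ('2008-10-02T09:28:00');

-- Interval ends from the answer; the starts are assumed to be contiguous.
insert into time_interval values ('2008-10-02T09:00:00', '2008-10-02T09:05:00');
insert into time_interval values ('2008-10-02T09:05:00', '2008-10-02T09:15:00');
insert into time_interval values ('2008-10-02T09:15:00', '2008-10-02T09:25:00');
insert into time_interval values ('2008-10-02T09:25:00', '2008-10-02T09:30:00');
```

With these rows the cumulative query gives 2, 4, 6 and 7, and the per-range query gives 2, 2, 2 and 1.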
Answer 5 (score: 0)
This uses quite a few SQL tricks (SQL Server 2005):
CREATE TABLE [dbo].[stackoverflow_165571](
[visit] [datetime] NOT NULL
) ON [PRIMARY]
GO
;WITH buckets AS (
SELECT dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0) AS visit_bucket
,COUNT(*) AS visit_count
FROM stackoverflow_165571
GROUP BY dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0)
)
SELECT LEFT(CONVERT(varchar, l.visit_bucket, 8), 5) + ' - ' + CONVERT(varchar, SUM(r.visit_count))
FROM buckets l
LEFT JOIN buckets r
ON r.visit_bucket <= l.visit_bucket
GROUP BY l.visit_bucket
ORDER BY l.visit_bucket
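Note that it puts all the times on the same day and assumes they are stored in a datetime column. The only thing it doesn't do is strip the leading zero from the time representation.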