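I have an MS / Azure SQL table named MY_TABLE (for example). The two columns used in the query below are datetime (a Unix epoch timestamp in seconds) and val (the measured value).
This table represents a time series, typically sampled every minute. The data may contain gaps or missing values. The SQL database is linked to a Django web application, where the time series is displayed on a chart.
I use the following query to retrieve the data aggregated at 1-hour resolution between two epoch timestamps: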
WITH time_table as (
SELECT cast(dateadd(second, datetime, '19700101') as DATETIME) as calendar_date,val,datetime as epoch
FROM MY_TABLE
WHERE datetime>=1451628000 and datetime<1452755200
)
SELECT min(epoch),avg(val)
FROM time_table
GROUP BY YEAR(calendar_date),MONTH(calendar_date),DAY(calendar_date),datepart(hour,calendar_date)
ORDER BY min(calendar_date) ASC
This query returns the average of val for each hour that contains at least one row.
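To make the grouping concrete, here is a minimal Python sketch of what the query computes, assuming the datetime column holds epoch seconds and the converted calendar_date is read as UTC. The function name hourly_averages and the sample rows are hypothetical and used only for illustration.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


def hourly_averages(rows):
    """Group (epoch, val) pairs by hour and return (min epoch, average val) per hour.

    Mirrors the GROUP BY year/month/day/hour in the query: epoch 0 falls on an
    hour boundary, so integer division by 3600 identifies the same buckets.
    Hours without any rows do not appear in the result.
    """
    buckets = defaultdict(list)
    for epoch, val in rows:
        buckets[epoch // SECONDS_PER_HOUR].append((epoch, val))

    result = []
    for hour in sorted(buckets):
        points = buckets[hour]
        first_epoch = min(e for e, _ in points)
        mean_val = sum(v for _, v in points) / len(points)
        result.append((first_epoch, mean_val))
    return result


# Hypothetical sample: two points in the 06:00 hour, none in 07:00, one in 08:00.
sample = [
    (1451628000, 1.0),
    (1451628060, 3.0),
    (1451635200, 5.0),
]
print(hourly_averages(sample))
# [(1451628000, 2.0), (1451635200, 5.0)]
```

Note that the 07:00 hour is simply missing from the output, which is the behaviour of the query on data with gaps.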
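Problem
I could easily do this with Python / Pandas, but I feel that it would not be optimal.
Answer 0 (score: 0)
You could generate an Hours table that stores each hour in both epoch form and datetime form: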
create table dbo.Hours (
EpochStart bigint not null primary key clustered
, EpochEnd bigint not null
, Datetime_Hour datetime not null
, Datetime_NextHour datetime not null
);
declare @fromdate datetime = '20160101';
declare @thrudate datetime = '20161231';
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, Hours as (
select top ((datediff(day, @fromdate,@thrudate)+1)*24)
[Datetime_Hour]=convert(datetime,dateadd(hour,row_number() over(order by (select 1))-1,@fromdate))
,[Datetime_NextHour]=convert(datetime,dateadd(hour,row_number() over(order by (select 1)),@fromdate))
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by 1
)
insert into dbo.Hours
select
EpochStart = datediff(second,'19700101',Datetime_Hour)
, EpochEnd = datediff(second,'19700101',Datetime_NextHour)
, Datetime_Hour
, Datetime_NextHour
from Hours;
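As a rough illustration of what ends up in dbo.Hours, the following Python sketch builds the same four values per row. It assumes the SQL datetime values are interpreted as UTC; the generator name hours_rows is hypothetical and not part of the answer.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hours_rows(from_date, thru_date):
    """Yield (EpochStart, EpochEnd, Datetime_Hour, Datetime_NextHour) tuples.

    Covers every hour from from_date through the end of thru_date, like the
    TOP ((datediff(day, @fromdate, @thrudate) + 1) * 24) clause.
    """
    n_hours = ((thru_date - from_date).days + 1) * 24
    for i in range(n_hours):
        hour = from_date + timedelta(hours=i)
        next_hour = hour + timedelta(hours=1)
        yield (
            int((hour - EPOCH).total_seconds()),
            int((next_hour - EPOCH).total_seconds()),
            hour,
            next_hour,
        )


rows = list(hours_rows(
    datetime(2016, 1, 1, tzinfo=timezone.utc),
    datetime(2016, 12, 31, tzinfo=timezone.utc),
))
print(len(rows))    # 8784 (366 days * 24 hours, 2016 is a leap year)
print(rows[0][:2])  # (1451606400, 1451610000)
```

Each row covers the half-open interval from EpochStart up to, but not including, EpochEnd, so consecutive rows do not overlap.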
Then use it in your query like so:
select
h.EpochStart
, avg(val)
from dbo.Hours h
left join MY_TABLE t
on t.datetime >= h.EpochStart
and t.datetime < h.EpochEnd
where h.EpochStart >= 1451628000
and h.EpochStart < 1452755200
group by h.EpochStart
order by h.EpochStart
rextester demo: http://rextester.com/INZW16141
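The effect of the left join can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name hourly_averages_with_gaps and the sample rows are hypothetical, timestamps are epoch seconds, and None stands in for the NULL that avg(val) returns for an hour with no matching rows.

```python
SECONDS_PER_HOUR = 3600


def hourly_averages_with_gaps(rows, start_epoch, end_epoch):
    """Return one (hour_start, average) pair for every hour whose start lies
    in [start_epoch, end_epoch); the average is None for hours with no rows.
    """
    sums = {}
    counts = {}
    for epoch, val in rows:
        hour_start = epoch - epoch % SECONDS_PER_HOUR
        sums[hour_start] = sums.get(hour_start, 0.0) + val
        counts[hour_start] = counts.get(hour_start, 0) + 1

    # First hour boundary at or after start_epoch (ceiling division).
    first_hour = -(-start_epoch // SECONDS_PER_HOUR) * SECONDS_PER_HOUR

    result = []
    for hour_start in range(first_hour, end_epoch, SECONDS_PER_HOUR):
        if hour_start in counts:
            average = sums[hour_start] / counts[hour_start]
        else:
            average = None  # plays the role of SQL NULL
        result.append((hour_start, average))
    return result


# Hypothetical sample: two points in the 06:00 hour, none in 07:00, one in 08:00.
sample = [
    (1451628000, 1.0),
    (1451628060, 3.0),
    (1451635200, 5.0),
]
print(hourly_averages_with_gaps(sample, 1451628000, 1451638800))
# [(1451628000, 2.0), (1451631600, None), (1451635200, 5.0)]
```

Because the range filter applies to the hour start rather than to the data rows, the last hour in the range is averaged over its full 60 minutes even when end_epoch falls inside it, which is also how the query above behaves.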