I have a (Sybase) table that contains the following information:
order_id int
timestamp datetime
action char(1) --i=inserted, c=corrected, r=removed
shares int
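It tracks the shares associated with an order (identified by its order_id) in the system. For example, the life cycle of an order looks like this: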
timestamp action shares
10:00:00 i 1000 -- initial Insert
10:06:30 c 900 -- one Change
10:07:12 c 800
10:50:20 r 800 -- Removal
11:10:10 i 600 -- 2nd Insert
11:12:10 r 600
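In the example above, the order is active from 10:00:00 to 10:50:20 and again from 11:10:10 to 11:12:10.
I have 1000 such orders in the system, and I need to plot, as a histogram, how many shares are active over a time series divided into 5-minute bins/buckets. If the number of shares for a given order changes more than once within the same bin, I need to average the shares; for example, in the 10:05-10:10 bin above, 1000, 900 and 800 average to 900.
Here is a more complex example: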
1, "20140828 10:00:00", "i", 1000
1, "20140828 10:06:00", "c", 900
1, "20140828 10:07:12", "c", 500
1, "20140828 10:10:10", "c", 400
1, "20140828 10:20:20", "r", 400
1, "20140828 10:30:10", "i", 300
1, "20140828 10:32:10", "r", 300
2, "20140828 09:51:00", "i", 500
2, "20140828 10:08:30", "r", 500
3, "20140828 10:10:00", "i", 1000
3, "20140828 10:11:20", "r", 1000
Expected output:
10:00:00 1500
10:05:00 1300
10:10:00 1450
10:15:00 400
10:20:00 400
10:25:00 0
10:30:00 300
10:35:00 0
10:40:00 0
10:45:00 0
10:50:00 0
10:55:00 0
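To pin down the rule behind these numbers, here is a minimal reference sketch in Python (not SQL). It assumes that, per order and per bucket, the values to average are the share count carried in from before the bucket opens (if the order is still active) plus the shares of every 'i' and 'c' event inside the bucket, and that an 'r' row only ends the order without contributing a value. The names `bucket_totals`, `at` and `EVENTS` are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=5)

def bucket_totals(events, start, end):
    """Sum, over all orders, of each order's average active shares per bucket."""
    by_order = defaultdict(list)
    for order_id, ts, action, shares in events:
        by_order[order_id].append((ts, action, shares))
    for evs in by_order.values():
        evs.sort(key=lambda e: e[0])  # process each order chronologically

    totals = []
    t = start
    while t < end:
        total = 0.0
        for evs in by_order.values():
            carried = None  # shares in force just before the bucket opens
            seen = []       # shares set by 'i'/'c' events inside the bucket
            for ts, action, shares in evs:
                if ts >= t + BUCKET:
                    break
                if ts < t:
                    carried = None if action == "r" else shares
                elif action != "r":
                    seen.append(shares)
            values = ([carried] if carried is not None else []) + seen
            if values:
                total += sum(values) / len(values)
        totals.append((t, total))
        t += BUCKET
    return totals

def at(hms):
    """Timestamp on the sample day 2014-08-28."""
    return datetime.strptime("20140828 " + hms, "%Y%m%d %H:%M:%S")

EVENTS = [
    (1, at("10:00:00"), "i", 1000), (1, at("10:06:00"), "c", 900),
    (1, at("10:07:12"), "c", 500),  (1, at("10:10:10"), "c", 400),
    (1, at("10:20:20"), "r", 400),  (1, at("10:30:10"), "i", 300),
    (1, at("10:32:10"), "r", 300),
    (2, at("09:51:00"), "i", 500),  (2, at("10:08:30"), "r", 500),
    (3, at("10:10:00"), "i", 1000), (3, at("10:11:20"), "r", 1000),
]

totals = bucket_totals(EVENTS, at("10:00:00"), at("11:00:00"))
for t, total in totals:
    print(t.strftime("%H:%M:%S"), round(total))
```

The sketch rescans every order for every bucket, which is fine for checking results on a small sample but is not meant as an efficient implementation.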
Thanks in advance for your help.
Answer 0 (score: 0)
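This is a variant of the Running Sum problem in SQL Server (MS or Sybase, owing to their shared history), grouped by a bucket ID, which can simply be the time difference in minutes from a base time, integer-divided by 5. So something like this would do: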
create table #t(
    BucketNo int not null primary key clustered,
    Activity int not null,
    Active   int not null
);
-- pre-aggregate activity data: net change in active shares per bucket
-- assumes prior existence of a zero-based NUMBERS or TALLY table with column N
-- assumes a single day of data, so buckets line up with N = 0..287
insert #t(BucketNo,Activity,Active)
select
     N
    ,isnull(data.Activity,0)
    ,0
from NUMBERS
left join (
    select
         BucketNo
        ,sum(Delta) as Activity
    from (
        -- inserts add shares, removals subtract them
        select
             (datediff(mi,0,TimeStamp) % (24 * 60)) / 5 as BucketNo
            ,case action when 'i' then +1
                         when 'r' then -1
             end * shares as Delta
        from ActivityTable
        where action <> 'c'
        union all
        -- corrections contribute the difference from the previous share count
        select
             (datediff(mi,0,c.TimeStamp) % (24 * 60)) / 5
            ,c.shares
              - ( select top 1 p.shares
                  from ActivityTable p
                  where p.order_id = c.order_id and p.TimeStamp < c.TimeStamp
                  order by p.TimeStamp desc
                )
        from ActivityTable as c
        where c.action = 'c'
    ) deltas
    group by BucketNo
) data on data.BucketNo = N
where N < 24 * 12; -- 5 minute buckets per day
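The bucket number used in the query is plain integer arithmetic on minutes. A small Python sketch of the same mapping follows; taking midnight of the same day as the base time is an assumption, and the helper names `bucket_no` and `bucket_start` are hypothetical.

```python
from datetime import datetime

def bucket_no(ts, minutes_per_bucket=5):
    """Zero-based index of the bucket containing ts, counted from midnight."""
    minute_of_day = ts.hour * 60 + ts.minute
    return minute_of_day // minutes_per_bucket

def bucket_start(n, minutes_per_bucket=5):
    """Clock time (HH:MM) at which bucket n opens."""
    hours, minutes = divmod(n * minutes_per_bucket, 60)
    return f"{hours:02d}:{minutes:02d}"

# The correction at 10:07:12 falls into the bucket that opens at 10:05.
n = bucket_no(datetime(2014, 8, 28, 10, 7, 12))
print(n, bucket_start(n))
```

With 5-minute buckets a day has 24 * 12 = 288 of them, numbered 0 through 287, which is where the `N < 24 * 12` filter comes from.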
Now we use the SQL Server quirky update to process #t in clustered index order and perform the running sum.
declare @Shares int = 0;

-- `quirky update` peculiar to SQL Server
update #t
    set @Shares = Active = @Shares + Activity -- running total of net activity
from #t with (TABLOCKX) -- not strictly necessary when using a temp table.
option (MAXDOP 1);      -- prevent parallelization of query

select BucketNo, Active from #t order by BucketNo
go
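The running sum that the update performs can be checked outside the database. Below is a sketch in Python, assuming the per-bucket net activity has already been computed. The `activity` values are worked out by hand from order 1 in the question's second example and cover the buckets 10:00 through 10:35.

```python
from itertools import accumulate

# Net change in shares per 5-minute bucket for order 1:
# 10:00 +1000 (insert), 10:05 -100 and -400 (two corrections),
# 10:10 -100 (correction), 10:20 -400 (removal),
# 10:30 +300 and -300 (insert and removal in the same bucket).
activity = [1000, -500, -100, 0, -400, 0, 0, 0]

# Running total: shares still active at the end of each bucket.
active = list(accumulate(activity))
print(active)
```

A running total of net changes gives the share count in force at the end of each bucket, so a bucket in which an order is inserted and removed nets out to zero.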