SQL: Consolidating sizes into bins

Time: 2014-08-28 02:27:19

Tags: sql sybase-ase

I have a (Sybase) table that contains the following information:

 order_id   int   
 timestamp  datetime   
 action     char(1)      --i=inserted, c=corrected, r=removed   
 shares      int

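It tracks the shares associated with an order (identified by its order_id) in the system. As an example, the life cycle of one order looks like this: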

  timestamp action  shares      
  10:00:00  i       1000     -- initial Insert    
  10:06:30  c       900      -- one Change    
  10:07:12  c       800    
  10:50:20  r       800      -- Removal    
  11:10:10  i       600      -- 2nd Insert    
  11:12:10  r       600

In the example above, the order is active from 10:00:00 to 10:50:20, and again from 11:10:10 to 11:12:10.
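
To make the notion of "active" concrete, here is a minimal Python sketch that pairs each insert with the following removal for a single order. The function name `active_intervals` and the tuple layout are assumptions made for illustration; they are not part of the original table or of any library.

```python
from datetime import time

# (timestamp, action, shares) rows for one order, copied from the example above.
events = [
    (time(10, 0, 0), "i", 1000),
    (time(10, 6, 30), "c", 900),
    (time(10, 7, 12), "c", 800),
    (time(10, 50, 20), "r", 800),
    (time(11, 10, 10), "i", 600),
    (time(11, 12, 10), "r", 600),
]

def active_intervals(rows):
    """Pair each 'i' row with the next 'r' row to get (start, end) intervals."""
    intervals = []
    start = None  # timestamp of the currently open insert, if any
    for ts, action, _shares in sorted(rows):
        if action == "i":
            start = ts
        elif action == "r" and start is not None:
            intervals.append((start, ts))
            start = None
        # 'c' rows change the share count but not whether the order is active
    return intervals

for start, end in active_intervals(events):
    print(start, "->", end)
```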

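I have 1000 such orders in the system, and I need to plot, as a histogram, how many shares are active in a time series divided into 5-minute bins/buckets. If the number of shares for a given order changes more than once within the same bin, I need to average the shares; as in the example above for 10:05-10:10, where 1000, 900 and 800 average to 900.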

Here is a more complex example:

1, "20140828 10:00:00",  "i", 1000
1, "20140828 10:06:00",  "c",  900
1, "20140828 10:07:12",  "c",  500
1, "20140828 10:10:10",  "c",  400
1, "20140828 10:20:20",  "r",  400
1, "20140828 10:30:10",  "i",  300
1, "20140828 10:32:10",  "r",  300

2, "20140828 09:51:00",  "i",  500
2, "20140828 10:08:30",  "r",  500

3, "20140828 10:10:00",  "i", 1000
3, "20140828 10:11:20",  "r", 1000

Expected output:

10:00:00 1500
10:05:00 1300
10:10:00 1450
10:15:00 400
10:20:00 400
10:25:00 0
10:30:00 300
10:35:00 0
10:40:00 0
10:45:00 0
10:50:00 0
10:55:00 0
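
The averaging rule is easier to check against a small reference implementation. The Python sketch below is one possible reading of the requirement, not a statement of the intended SQL: for each order and bin it averages the share count carried in from the previous bin (if the order was active) together with every value set by an `i` or `c` row inside the bin, and it treats `r` rows as ending activity without adding a value. The names `order_bins` and `histogram` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN = timedelta(minutes=5)

rows = [
    (1, "20140828 10:00:00", "i", 1000),
    (1, "20140828 10:06:00", "c", 900),
    (1, "20140828 10:07:12", "c", 500),
    (1, "20140828 10:10:10", "c", 400),
    (1, "20140828 10:20:20", "r", 400),
    (1, "20140828 10:30:10", "i", 300),
    (1, "20140828 10:32:10", "r", 300),
    (2, "20140828 09:51:00", "i", 500),
    (2, "20140828 10:08:30", "r", 500),
    (3, "20140828 10:10:00", "i", 1000),
    (3, "20140828 10:11:20", "r", 1000),
]

def order_bins(events, first, last):
    """Yield (bin_start, average shares) for one order's sorted events."""
    current = None  # shares currently active, None while the order is removed
    idx = 0
    # Replay rows that precede the first bin so the carry-in value is right.
    while idx < len(events) and events[idx][0] < first:
        _, action, shares = events[idx]
        current = None if action == "r" else shares
        idx += 1
    b = first
    while b <= last:
        values = [] if current is None else [current]  # value carried into the bin
        while idx < len(events) and events[idx][0] < b + BIN:
            _, action, shares = events[idx]
            if action == "r":
                current = None
            else:
                current = shares
                values.append(shares)
            idx += 1
        if values:
            yield b, sum(values) / len(values)
        b += BIN

def histogram(rows, first, last):
    """Return [(bin_start, total of per-order averages)] for every bin."""
    by_order = defaultdict(list)
    for order_id, ts, action, shares in rows:
        parsed = datetime.strptime(ts, "%Y%m%d %H:%M:%S")
        by_order[order_id].append((parsed, action, shares))
    totals = defaultdict(float)
    for events in by_order.values():
        events.sort()
        for b, avg in order_bins(events, first, last):
            totals[b] += avg
    out, b = [], first
    while b <= last:
        out.append((b, totals[b]))
        b += BIN
    return out

first = datetime(2014, 8, 28, 10, 0)
last = datetime(2014, 8, 28, 10, 55)
for b, total in histogram(rows, first, last):
    print(f"{b:%H:%M:%S} {total:g}")
```

Under this reading the sketch reproduces the expected output listed above for the sample rows. Whether an `r` row should also count as an observed value is not settled by the sample data, since every removal there repeats the preceding share count.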

Thanks in advance for your help.

1 answer:

Answer 0: (score: 0)

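This is a variation of the Running Sum problem in SQL Server (MS or Sybase, given their shared history), grouped by a bucket ID, which is probably just the difference in minutes from a base time, integer-divided by 5. So something like this should do: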

create table #t(
    BucketNo    int not null primary key clustered,
    Activity    int not null,   -- net change in active shares within the bucket
    Active      int not null    -- running total, filled in by the update below
);

-- pre-aggregate activity data
-- assumes prior existence of a zero-based NUMBERS or TALLY table
-- note: 0 in datediff() is the base time; substitute the start of the day
-- being reported so that BucketNo lines up with N (0 .. 287)
insert #t(BucketNo,Activity,Active) 
select 
     N
    ,isnull(sum(data.Activity),0)
    ,0
from NUMBERS 
left join (
    -- inserts add shares, removals subtract them
    select
         datediff(mi,0,TimeStamp) / 5 as BucketNo
        ,case action when 'i' then +1
                     when 'r' then -1
         end * shares          as Activity
    from  ActivityTable
    where action <> 'c'

    union all

    -- corrections contribute the difference from the previous share count
    select
         datediff(mi,0,c.TimeStamp) / 5 as BucketNo
        ,c.shares
         - (  select top 1 p.shares 
              from ActivityTable p
              where p.order_id = c.order_id  and  p.TimeStamp < c.TimeStamp
              order by p.TimeStamp desc
           ) as Activity
    from ActivityTable as c
    where c.action  = 'c'
) data on data.BucketNo = N
where N < 24 * 12 -- 5 minute buckets per day
group by N;
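
The bucket arithmetic used above (minutes from a base time, integer-divided by 5) can be sketched in Python as follows. This version assumes the base time is midnight of the day being reported, which is what makes the bucket numbers fall in the range 0 to 287; `bucket_no` and `bucket_start` are hypothetical helper names.

```python
from datetime import datetime

def bucket_no(ts, width_minutes=5):
    """Minutes since midnight, integer-divided by the bucket width."""
    return (ts.hour * 60 + ts.minute) // width_minutes

def bucket_start(n, width_minutes=5):
    """Format the start time of bucket n as HH:MM:SS."""
    total = n * width_minutes
    return f"{total // 60:02d}:{total % 60:02d}:00"

ts = datetime(2014, 8, 28, 10, 7, 12)
n = bucket_no(ts)
print(n, bucket_start(n))
```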

Now we use the SQL Server quirky update to process #t in clustered index order and perform the running sum.

declare @Shares   int = 0,
        @BucketNo int = 0;

-- `quirky update` peculiar to SQL Server
update #t
   set @Shares = Active                -- store the running total in Active
               = @Shares + Activity,   -- previous total plus this bucket's net change
       @BucketNo = BucketNo            -- anchor on the clustered key column
from #t with (TABLOCKX) -- not strictly necessary when using a temp table.
option (MAXDOP 1);      -- prevent parallelization of query

select BucketNo, Active from #t order by BucketNo
go
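
For comparison, the running-sum step on its own can be sketched in a few lines of Python. The per-bucket net changes below were worked out by hand from the larger example in the question (bucket 118 starts at 09:50), and `running_totals` is a hypothetical name. A running sum of this kind gives the shares still active at the end of each bucket, which is not necessarily the within-bucket average the question asks for.

```python
from itertools import accumulate

# Net change in active shares per 5-minute bucket (bucket number -> delta).
# Buckets that are missing from the dict saw no activity.
deltas = {118: 500, 120: 1000, 121: -1000, 122: -100, 124: -400}

def running_totals(deltas, first, last):
    """Cumulative sum of the deltas over buckets first..last inclusive."""
    return list(accumulate(deltas.get(n, 0) for n in range(first, last + 1)))

for n, total in zip(range(118, 127), running_totals(deltas, 118, 126)):
    print(n, total)
```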