Grouping consecutive rows

Time: 2014-08-02 08:07:14

Tags: sql database sqlite

I have the following table called Phases. It describes DRAM operations:

Transact    PhaseName   TransactionBegin
1           REQ         0   
1           RESP        25  
2           REQ         5
2           RESP        30
10          REQ         50
10          RESP        105
11          REQ         55
11          RESP        115
21          REQ         60
21          RESP        120
22          REQ         65
22          RESP        125
23          REQ         70
23          RESP        130
24          REQ         75
24          RESP        140
37          REQ         200
37          RESP        240
38          REQ         205
38          RESP        245
...

I need to find the time between the first REQ and the last RESP of each group. A group is a run of rows whose Transact values are consecutive.

TransactGroup   Period
(1..2)          30
(10..11)        65
(21..24)        80
(37..38)        45

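I would also be happy if I could find the average of these periods for: 1) all groups that contain 2 transactions, 2) all groups that contain 6 transactions.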

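3 answers:

Answer 0 (score: 3)

I would take a different approach. First, I would aggregate the rows by Transact and add an enumeration column. The difference between this column and Transact gives you the grouping you are looking for: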

with p as (
      select Transact,
             max(case when PhaseName = 'REQ' then TransactionBegin end) as req,
             max(case when PhaseName = 'RESP' then TransactionBegin end) as resp
      from phases
      group by Transact
     ),
     pn as (
      select p.*, (select count(*) from p p2 where p2.Transact <= p.Transact) as seqnum
      from p
     )
select min(Transact), max(Transact), max(resp) - min(req)
from pn
group by (Transact - seqnum);

Edit:

Without the with clause, the query loses a little elegance. This is what it looks like:

select min(Transact), max(Transact), max(resp) - min(req)
from (select p.*,
             (select count(distinct p2.Transact)
              from phases p2
              where p2.Transact <= p.Transact
             ) as seqnum
      from (select Transact,
                   max(case when PhaseName = 'REQ' then TransactionBegin end) as req,
                   max(case when PhaseName = 'RESP' then TransactionBegin end) as resp
            from phases
            group by Transact
           ) p
     ) pn
group by (Transact - seqnum);

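Note that I changed the subquery slightly to use count(distinct). The subquery now runs against the base table, so it has to count distinct Transact values rather than all rows to get the correct enumeration.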
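
To see the Transact - seqnum trick in action, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs the with query, taking the period as max(resp) - min(req). The helper names phase_rows and group_periods and the column types are assumptions made for illustration; they are not part of the answer.

```python
import sqlite3

# Sample data from the question as (Transact, REQ time, RESP time).
SAMPLE = [(1, 0, 25), (2, 5, 30), (10, 50, 105), (11, 55, 115), (21, 60, 120),
          (22, 65, 125), (23, 70, 130), (24, 75, 140), (37, 200, 240), (38, 205, 245)]


def phase_rows(sample):
    """Expand each transaction into its REQ and RESP rows (hypothetical helper)."""
    rows = []
    for transact, req, resp in sample:
        rows.append((transact, "REQ", req))
        rows.append((transact, "RESP", resp))
    return rows


def group_periods(rows):
    """Return (first Transact, last Transact, period) per run of consecutive Transacts."""
    con = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not show the schema.
    con.execute("CREATE TABLE Phases (Transact INTEGER, PhaseName TEXT, TransactionBegin INTEGER)")
    con.executemany("INSERT INTO Phases VALUES (?, ?, ?)", rows)
    query = """
        WITH p AS (
            SELECT Transact,
                   MAX(CASE WHEN PhaseName = 'REQ'  THEN TransactionBegin END) AS req,
                   MAX(CASE WHEN PhaseName = 'RESP' THEN TransactionBegin END) AS resp
            FROM Phases
            GROUP BY Transact
        ),
        pn AS (
            -- seqnum enumerates the transactions 1, 2, 3, ... in Transact order
            SELECT p.*,
                   (SELECT COUNT(*) FROM p p2 WHERE p2.Transact <= p.Transact) AS seqnum
            FROM p
        )
        -- Transact - seqnum stays constant within a run of consecutive Transacts
        SELECT MIN(Transact), MAX(Transact), MAX(resp) - MIN(req)
        FROM pn
        GROUP BY Transact - seqnum
        ORDER BY MIN(Transact)
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for first, last, period in group_periods(phase_rows(SAMPLE)):
    print(f"({first}..{last})  {period}")
```

On the sample data this should reproduce the four groups and periods listed in the question.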

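Answer 1 (score: 0)

The transaction groups can be computed on the fly, but that makes the query very complex. It is better to add them as a new column: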

ALTER TABLE Phases
ADD COLUMN TransactGroup;

UPDATE Phases
SET TransactGroup = (SELECT First.Transact
                     FROM Phases AS First
                     WHERE First.Transact <= Phases.Transact
                       AND NOT EXISTS (SELECT 1
                                       FROM Phases AS Previous
                                       WHERE Previous.Transact = First.Transact - 1)
                     ORDER BY First.Transact DESC
                     LIMIT 1);

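As the group identifier, we use the first Transact in the group. A row is the first in its group if no row has the preceding Transact number. To find the first row of the group from an arbitrary row, we search for the latest group-first row that does not come after that row.

The query can then be done with a simple GROUP BY (the CASE expressions turn the unwanted values into NULL, which MIN/MAX ignore):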

SELECT TransactGroup,
       MAX(CASE PhaseName WHEN 'RESP' THEN TransactionBegin END) -
       MIN(CASE PhaseName WHEN 'REQ'  THEN TransactionBegin END) AS Period,
       MAX(Transact) - MIN(Transact) + 1 AS TransactCount
FROM Phases
GROUP BY TransactGroup;

SELECT TransactCount,
       AVG(Period)
FROM (... the previous query ...)
WHERE TransactCount IN (2, 6)
GROUP BY TransactCount;
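
For anyone who wants to try the whole pipeline end to end, the following is a rough sketch using Python's sqlite3 module on the sample rows from the question: it adds the TransactGroup column, fills it with the UPDATE, and then nests the per-group query inside the averaging query. The function names, the column types, and the explicit First./Previous. qualifiers are my own assumptions for illustration.

```python
import sqlite3

# Sample data from the question as (Transact, REQ time, RESP time).
SAMPLE = [(1, 0, 25), (2, 5, 30), (10, 50, 105), (11, 55, 115), (21, 60, 120),
          (22, 65, 125), (23, 70, 130), (24, 75, 140), (37, 200, 240), (38, 205, 245)]


def load_phases(con, sample):
    """Create and fill the Phases table (schema types are an assumption)."""
    con.execute("CREATE TABLE Phases (Transact INTEGER, PhaseName TEXT, TransactionBegin INTEGER)")
    for transact, req, resp in sample:
        con.execute("INSERT INTO Phases VALUES (?, 'REQ', ?)", (transact, req))
        con.execute("INSERT INTO Phases VALUES (?, 'RESP', ?)", (transact, resp))


def average_period_by_count(sample):
    """Return (TransactCount, average Period) for groups of 2 or 6 transactions."""
    con = sqlite3.connect(":memory:")
    load_phases(con, sample)
    con.execute("ALTER TABLE Phases ADD COLUMN TransactGroup")
    # Each row gets the Transact of the latest group-first row at or before it.
    con.execute("""
        UPDATE Phases
        SET TransactGroup = (SELECT First.Transact
                             FROM Phases AS First
                             WHERE First.Transact <= Phases.Transact
                               AND NOT EXISTS (SELECT 1
                                               FROM Phases AS Previous
                                               WHERE Previous.Transact = First.Transact - 1)
                             ORDER BY First.Transact DESC
                             LIMIT 1)
    """)
    # The inner SELECT is the per-group query; the outer one averages by group size.
    result = con.execute("""
        SELECT TransactCount, AVG(Period)
        FROM (SELECT TransactGroup,
                     MAX(CASE PhaseName WHEN 'RESP' THEN TransactionBegin END) -
                     MIN(CASE PhaseName WHEN 'REQ'  THEN TransactionBegin END) AS Period,
                     MAX(Transact) - MIN(Transact) + 1 AS TransactCount
              FROM Phases
              GROUP BY TransactGroup)
        WHERE TransactCount IN (2, 6)
        GROUP BY TransactCount
    """).fetchall()
    con.close()
    return result


print(average_period_by_count(SAMPLE))
```

The sample rows shown in the question contain no group of 6 transactions, so only the row for groups of 2 comes back for that data.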

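Answer 2 (score: 0)

This should work as long as the RESP start of a lower group is below the RESP start of every later group, which appears to be the case in the example: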

select
    t.transact groupstart, 
    min(tend) groupend, 
    min(respend)-transactionBegin Period
from t
join 
    (
    select transact tend, transactionbegin respend from t 
    where t.phasename='RESP' 
    and not exists 
    (select 1 from t t1 where t1.transact=t.transact+1) 
    ) t2 
on t.transact<t2.tend 
where t.phasename='REQ'
and not exists
(select 1 from t t1 where t1.transact=t.transact-1) 
group by transact

Here t is your table; t1 and t2 are aliases used in the query.

SQLFiddle

With this output as a subquery, the counts and averages are simple SQL.
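
As one possible way to do that last step, here is a sketch that wraps the query above in an outer SELECT and groups by the number of transactions per group. It runs through Python's sqlite3 module against the sample rows from the question, stored in a table named t as in this answer; the function name, the column types, and the output column names TransactCount, Groups, and AvgPeriod are assumptions for illustration.

```python
import sqlite3

# Sample data from the question as (Transact, REQ time, RESP time).
SAMPLE = [(1, 0, 25), (2, 5, 30), (10, 50, 105), (11, 55, 115), (21, 60, 120),
          (22, 65, 125), (23, 70, 130), (24, 75, 140), (37, 200, 240), (38, 205, 245)]


def average_by_group_size(sample):
    """Return (TransactCount, number of groups, average Period) per group size."""
    con = sqlite3.connect(":memory:")
    # Table name t follows the answer; the column types are an assumption.
    con.execute("CREATE TABLE t (transact INTEGER, phasename TEXT, transactionbegin INTEGER)")
    for transact, req, resp in sample:
        con.execute("INSERT INTO t VALUES (?, 'REQ', ?)", (transact, req))
        con.execute("INSERT INTO t VALUES (?, 'RESP', ?)", (transact, resp))
    result = con.execute("""
        SELECT groupend - groupstart + 1 AS TransactCount,
               COUNT(*)                  AS Groups,
               AVG(Period)               AS AvgPeriod
        FROM (
            -- the answer's query, unchanged: one row per group
            select t.transact groupstart,
                   min(tend) groupend,
                   min(respend) - transactionBegin Period
            from t
            join (select transact tend, transactionbegin respend
                  from t
                  where t.phasename = 'RESP'
                    and not exists (select 1 from t t1 where t1.transact = t.transact + 1)
                 ) t2
              on t.transact < t2.tend
            where t.phasename = 'REQ'
              and not exists (select 1 from t t1 where t1.transact = t.transact - 1)
            group by transact
        )
        GROUP BY groupend - groupstart + 1
        ORDER BY TransactCount
    """).fetchall()
    con.close()
    return result


for count, groups, avg_period in average_by_group_size(SAMPLE):
    print(count, groups, avg_period)
```

Adding WHERE TransactCount IN (2, 6) style filtering on the outer query would restrict the result to the group sizes asked about in the question.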