Counting equal consecutive values in an ordered set of rows

Time: 2016-12-30 19:39:16

Tags: sql postgresql window-functions gaps-and-islands postgresql-9.6

I have a table with two columns, like this:

CREATE TABLE actions (
  action_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  "action" text NOT NULL
);

with the following data in it:

        action_time         | action 
----------------------------+--------
 2016-12-30 14:12:33.353269 | a
 2016-12-30 14:12:38.536818 | b
 2016-12-30 14:12:43.305001 | a
 2016-12-30 14:12:49.432981 | a
 2016-12-30 14:12:53.536397 | b
 2016-12-30 14:12:57.449101 | b
 2016-12-30 14:13:01.592785 | a
 2016-12-30 14:13:06.192907 | b
 2016-12-30 14:13:11.249181 | b
 2016-12-30 14:13:13.690897 | b
(10 rows)
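To reproduce the example locally, the rows shown above could be loaded with a statement along these lines. This is only a convenience sketch: the timestamps and actions are copied from the output above, and it assumes the table was created with the CREATE TABLE statement from the question.

```sql
-- Sample rows copied from the psql output in the question.
INSERT INTO actions (action_time, "action") VALUES
  ('2016-12-30 14:12:33.353269', 'a'),
  ('2016-12-30 14:12:38.536818', 'b'),
  ('2016-12-30 14:12:43.305001', 'a'),
  ('2016-12-30 14:12:49.432981', 'a'),
  ('2016-12-30 14:12:53.536397', 'b'),
  ('2016-12-30 14:12:57.449101', 'b'),
  ('2016-12-30 14:13:01.592785', 'a'),
  ('2016-12-30 14:13:06.192907', 'b'),
  ('2016-12-30 14:13:11.249181', 'b'),
  ('2016-12-30 14:13:13.690897', 'b');
```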

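You can assume that there are no duplicate values in the action_time column.

How can I count how many times the most recent action occurred in a row, counting back from the last row?

There is no limit on how many identical actions can occur in a row, and any action can be the last one. There is also no limit on the number of distinct actions: I used only two to keep the sample data simple.

For this sample data I expect the result to be 3, because the last action is "b" and it occurred 3 times in a row.

I think a solution could be built by combining window functions with a WITH RECURSIVE clause, but I don't know how to do it.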

4 answers:

Answer 0: (score: 1)

I added a small twist to the classic gaps-and-islands solution. Note how the ROW_NUMBER function uses a descending ORDER BY.

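select  count(*)
from   (select
            action
            -- position counted back from the most recent row
           ,row_number() over (                    order by action_time desc) as rn
            -- position counted back within rows of the same action
           ,row_number() over (partition by action order by action_time desc) as rn_action
        from    actions
        ) t
-- rn - rn_action is constant within each run of identical actions
group by action
        ,rn - rn_action
-- keep only the run that contains the most recent row
having   min(rn) = 1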
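To see why the grouping isolates the last run, it may help to look at the intermediate values for the question's sample data. The query below is a sketch run against the question's actions table; the column alias grp is a name introduced here for illustration only.

```sql
-- Show both row numbers and their difference for every row,
-- most recent row first.
select action, rn, rn_action, rn - rn_action as grp
from (
    select action
         , row_number() over (order by action_time desc) as rn
         , row_number() over (partition by action order by action_time desc) as rn_action
    from actions
) t
order by rn;

-- With the sample data from the question this yields:
--  action | rn | rn_action | grp
-- --------+----+-----------+-----
--  b      |  1 |         1 |   0
--  b      |  2 |         2 |   0
--  b      |  3 |         3 |   0
--  a      |  4 |         1 |   3
--  b      |  5 |         4 |   1
--  b      |  6 |         5 |   1
--  a      |  7 |         2 |   5
--  a      |  8 |         3 |   5
--  b      |  9 |         6 |   3
--  a      | 10 |         4 |   6
```

Only the group (b, 0) contains the row with rn = 1, so the HAVING clause keeps that group alone and count(*) returns 3.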

Answer 1: (score: 1)

This should do it.

SELECT COUNT(*)
FROM actions
WHERE action_time > (
SELECT action_time
  FROM actions 
  WHERE action <> (SELECT action FROM actions ORDER BY action_time DESC LIMIT 1) 
ORDER BY action_time DESC LIMIT 1);

The innermost query

SELECT action FROM actions ORDER BY action_time DESC LIMIT 1

determines the last action.

The query

SELECT action_time
  FROM actions 
  WHERE action <> (SELECT action FROM actions ORDER BY action_time DESC LIMIT 1) 
ORDER BY action_time DESC LIMIT 1

finds the last row that has a different action.

The outermost query counts all rows after that row.
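One edge case worth noting: if every row in the table has the same action, the middle query finds no row, the comparison is made against NULL, and the count comes back as 0 rather than the total number of rows. A possible variant that covers this case is sketched below. It is an assumption layered on top of the answer, not part of it, and it relies on PostgreSQL's special timestamp value '-infinity'.

```sql
SELECT COUNT(*)
FROM actions
WHERE action_time > COALESCE(
  -- last row whose action differs from the most recent action
  (SELECT action_time
   FROM actions
   WHERE action <> (SELECT action FROM actions ORDER BY action_time DESC LIMIT 1)
   ORDER BY action_time DESC LIMIT 1),
  -- no such row: compare against a value earlier than any real timestamp
  '-infinity'::timestamp);
```

With the sample data from the question the subquery does find a row, so the result is unchanged.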

Answer 2: (score: 1)

An improved solution

select  count(*)
from   (select
            action
           ,row_number() over (                    order by action_time desc) as rn
           ,row_number() over (partition by action order by action_time desc) as rn_action
        from    actions
        ) t
-- the two row numbers agree only within the most recent run
where   rn = rn_action

Answer 3: (score: 0)

I came up with this:

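select count(*)
from actions t cross join
     (select t2.action
      from actions t2
      order by t2.action_time desc
      limit 1
     ) last
-- strictly greater than: the last row with a different action must not be counted
where t.action_time > (select max(t2.action_time)
                       from actions t2
                       where t2.action <> last.action
                      );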

This should be able to take advantage of an index on (action_time, action).
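Such an index could be created roughly as follows. The index name is hypothetical, and whether the planner actually uses the index depends on table size and statistics, so checking the plan with EXPLAIN is a reasonable next step; on a table as small as the sample, a sequential scan is likely to be chosen regardless.

```sql
-- Hypothetical index name; the column list follows the suggestion above.
CREATE INDEX actions_time_action_idx ON actions (action_time, action);

-- Inspect the plan of the inner lookup to see whether the index is chosen.
EXPLAIN
SELECT max(action_time)
FROM actions
WHERE action <> 'b';
```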