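Group by repeating column

Time: 2018-04-24 20:24:59

Tags: sql postgresql

I can't put this problem into words, which is probably why I can't find an example, so here is what I am trying to do.

I have a table like this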

    | counter|      timestamp      |
    |   1    | 2018-01-01T11:11:01 |
    |   1    | 2018-01-01T11:11:02 |
    |   1    | 2018-01-01T11:11:03 |
    |   2    | 2018-01-01T11:11:04 |
    |   2    | 2018-01-01T11:11:05 |
    |   3    | 2018-01-01T11:11:06 |
    |   3    | 2018-01-01T11:11:07 |
    |   1    | 2018-01-01T11:11:08 |
    |   1    | 2018-01-01T11:11:09 |
    |   1    | 2018-01-01T11:11:10 |

What I want to do is group by each run of consecutive counter values, so that if I run the query

SELECT counter, MIN(timestamp) AS st, MAX(timestamp) AS et
FROM table
GROUP BY counter;

the result would be

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:03 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |
    |   1     | 2018-01-01T11:11:08 | 2018-01-01T11:11:10 |

instead of what actually happens

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:10 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |

So ideally I want something that combines GROUP BY and PARTITION, without a nested query.

3 Answers:

Answer 0 (score: 3)

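You have to identify the groups of consecutive rows that share the same counter value. This can be done with two window functions, lag() and a cumulative sum():

select counter, min(timestamp) as st, max(timestamp) as et
from (
    -- the running sum of the flags numbers each run of equal counters
    select counter, timestamp, sum(grp) over w as grp
    from (
        -- flag is 1 when counter differs from the previous row (lag defaults to 0 on the first row)
        select *, (lag(counter, 1, 0) over w <> counter)::int as grp
        from my_table
        window w as (order by timestamp)
        ) s
    window w as (order by timestamp)
    ) s
group by counter, grp
order by st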

DbFiddle.
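To make the two steps concrete, here is a minimal Python sketch that traces the same logic on the sample rows from the question. It is only an illustration outside the database; the variable names (`previous`, `flags`, `grp`, `islands`) are made up for this sketch and are not part of the answer.

```python
from itertools import accumulate

# Sample rows from the question: (counter, timestamp), already sorted by timestamp.
rows = [
    (1, "2018-01-01T11:11:01"),
    (1, "2018-01-01T11:11:02"),
    (1, "2018-01-01T11:11:03"),
    (2, "2018-01-01T11:11:04"),
    (2, "2018-01-01T11:11:05"),
    (3, "2018-01-01T11:11:06"),
    (3, "2018-01-01T11:11:07"),
    (1, "2018-01-01T11:11:08"),
    (1, "2018-01-01T11:11:09"),
    (1, "2018-01-01T11:11:10"),
]

counters = [c for c, _ in rows]

# Mirrors lag(counter, 1, 0): the previous row's counter, 0 for the first row.
previous = [0] + counters[:-1]

# Mirrors (lag <> counter)::int: 1 where a new run starts, 0 otherwise.
flags = [int(p != c) for p, c in zip(previous, counters)]

# Mirrors sum(grp) over (order by timestamp): a running total that numbers the runs.
grp = list(accumulate(flags))

# Mirrors group by counter, grp with min/max of the timestamp.
# ISO-8601 strings of equal length compare in chronological order.
islands = {}
for (counter, ts), g in zip(rows, grp):
    st, et = islands.get((counter, g), (ts, ts))
    islands[(counter, g)] = (min(st, ts), max(et, ts))

# Mirrors order by st.
result = sorted(
    ((counter, st, et) for (counter, _), (st, et) in islands.items()),
    key=lambda r: r[1],
)

for row in result:
    print(row)
```

Each run of equal counters gets its own number in `grp`, which is why the second run of counter 1 no longer merges with the first.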

Answer 1 (score: 1)

You should compute a new group identifier:

create table tbl(counter int, ts timestamp);
insert into tbl values
    (1, '2018-01-01T11:11:01'),
    (1, '2018-01-01T11:11:02'),
    (1, '2018-01-01T11:11:03'),
    (2, '2018-01-01T11:11:04'),
    (2, '2018-01-01T11:11:05'),
    (3, '2018-01-01T11:11:06'),
    (3, '2018-01-01T11:11:07'),
    (1, '2018-01-01T11:11:08'),
    (1, '2018-01-01T11:11:09'),
    (1, '2018-01-01T11:11:10');

-- rst marks the first row of each run; its running sum (grp) numbers the runs
select min(counter) as counter, min(ts) as st, max(ts) as et
from
(
    select counter, ts, sum(rst) over (order by ts) as grp
    from 
         (
         select counter, ts,
                -- 1 when counter differs from the previous row (-1 stands in for "no previous row")
                case when coalesce(lag(counter) over (order by ts), -1) <> counter then 1 end rst
         from   tbl
         ) t1
) t2
group by grp

counter | st                  | et                 
------: | :------------------ | :------------------
      3 | 2018-01-01 11:11:06 | 2018-01-01 11:11:07
      1 | 2018-01-01 11:11:08 | 2018-01-01 11:11:10
      2 | 2018-01-01 11:11:04 | 2018-01-01 11:11:05
      1 | 2018-01-01 11:11:01 | 2018-01-01 11:11:03

db<>fiddle here

Answer 2 (score: 1)

You can use ranking functions:

select counter, min(timestamp) st, max(timestamp) et
from (select *, 
               row_number() over (order by timestamp) Seq1,
               row_number() over (partition by counter order by timestamp) Seq2 
      from table 
     ) t
group by counter, (Seq1-Seq2);

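This groups by the difference of the two row numbers (Seq1-Seq2) in the GROUP BY clause. Within a run of consecutive rows with the same counter, both row numbers increase by one per row, so their difference stays constant; it changes as soon as the run is interrupted by another counter value.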
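As a rough check of why the Seq1-Seq2 difference works, the following Python sketch recomputes both row numbers by hand for the sample rows from the question. It is an illustration only; the names `seen`, `diffs` and `islands` are invented for this sketch.

```python
from collections import defaultdict

# Sample rows from the question: (counter, timestamp), already sorted by timestamp.
rows = [
    (1, "2018-01-01T11:11:01"),
    (1, "2018-01-01T11:11:02"),
    (1, "2018-01-01T11:11:03"),
    (2, "2018-01-01T11:11:04"),
    (2, "2018-01-01T11:11:05"),
    (3, "2018-01-01T11:11:06"),
    (3, "2018-01-01T11:11:07"),
    (1, "2018-01-01T11:11:08"),
    (1, "2018-01-01T11:11:09"),
    (1, "2018-01-01T11:11:10"),
]

seen = defaultdict(int)  # rows seen so far for each counter value
diffs = []               # Seq1 - Seq2 for every row, in timestamp order
islands = {}             # (counter, Seq1 - Seq2) -> (st, et)

for seq1, (counter, ts) in enumerate(rows, start=1):
    # seq1 mirrors row_number() over (order by timestamp).
    # seq2 mirrors row_number() over (partition by counter order by timestamp).
    seen[counter] += 1
    seq2 = seen[counter]
    diff = seq1 - seq2
    diffs.append(diff)

    # Mirrors group by counter, (Seq1 - Seq2) with min/max of the timestamp.
    st, et = islands.get((counter, diff), (ts, ts))
    islands[(counter, diff)] = (min(st, ts), max(et, ts))

print(diffs)
for key, span in islands.items():
    print(key, span)
```

The difference is only guaranteed to separate runs of the same counter, which is why the query groups by counter as well as by the difference.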