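Group two rows and all enclosed rows under a condition

Time: 2014-08-19 08:42:30

Tags: sql postgresql group-by window-functions

The table defined by the following query produces output like that shown in the screenshot below: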

SELECT id,
       value,
       CASE WHEN value = 'foo' AND random() <= 0.5
            THEN 't' ELSE NULL
       END AS to_group
FROM  (
   SELECT generate_series(1,100) AS id,
          CASE WHEN random() <= 0.2 THEN 'foo' ELSE 'bar' END AS value
   ) t1

[Screenshot of the query output; image not available]

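How can I group every 'foo' row marked with to_group = 't' together with the preceding 'foo' row (regardless of whether that row has to_group = 't') and all enclosed 'bar' rows?
In the given example, these would be rows 33-37.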

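2 Answers:

Answer 0: (score: 1)

The special difficulty is that we need some rows with ('foo', TRUE) in two groups: at the end of one group and at the start of the next. So we have to add another instance of those rows.

To make this easier to follow, and to work with a stable set of rows, I put your example values into a temporary table: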

CREATE TEMP TABLE t AS
SELECT id, value
      ,CASE WHEN value = 'foo' AND random() < 0.5
         THEN TRUE ELSE null
       END AS to_group
FROM  (
   SELECT id, CASE WHEN random() < 0.2 THEN 'foo' ELSE 'bar' END AS value
   FROM   generate_series(1,100) id
   ) sub;

Using the boolean data type for the flag makes things simpler. My query then boils down to:

WITH cte AS (
   SELECT *, count(value = 'foo' OR NULL) OVER (ORDER BY id) AS grp
   FROM  t
   )
SELECT grp, min(id) AS min_id, max(id) AS max_id
FROM  (
   SELECT id, value, to_group, grp     FROM cte
   UNION  ALL
   SELECT id, value, to_group, grp - 1 FROM cte WHERE to_group
   ) sub
GROUP  BY grp
HAVING count(value = 'foo' OR NULL) = 2
ORDER  BY grp;
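
The expression count(value = 'foo' OR NULL) in the query above depends on count() ignoring NULL arguments: the comparison yields TRUE for matching rows, and "OR NULL" turns FALSE into NULL so that non-matching rows are not counted. A minimal sketch of the idiom on an inline set of values (the alias s, the column v, and the sample values are assumptions for illustration); the FILTER variant should be equivalent but needs PostgreSQL 9.4 or later:

```sql
-- Three ways to count only the rows where v = 'foo'.
-- The sample values are hypothetical and not taken from the question.
SELECT count(v = 'foo' OR NULL)          AS via_or_null  -- FALSE becomes NULL and is skipped
     , count(*) FILTER (WHERE v = 'foo') AS via_filter   -- PostgreSQL 9.4+
     , sum((v = 'foo')::int)             AS via_sum      -- TRUE = 1, FALSE = 0, NULL ignored
FROM  (VALUES ('foo'), ('bar'), ('foo'), (NULL)) AS s(v);
-- All three columns return 2.
```

Used as a window aggregate with OVER (ORDER BY id), the same expression becomes a running count.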

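Explanation

  • In the CTE cte, I add a running count of rows with value = 'foo' as grp. The rows in between get the same number as the preceding 'foo' row.

  • As mentioned, the special difficulty is that we need some rows twice. So we add another instance of those rows with UNION ALL in the subquery sub. While doing so, I decrement grp of the copy by 1, so that your groups are now complete.

  • The final SELECT can now simply GROUP BY grp. Valid groups contain exactly two rows with value = 'foo'.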
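
The steps above can be traced on a small, fixed set of rows. This is only a sketch: the table name demo and its ten rows are assumptions chosen so that row 5 has to appear in two groups; they are not the data from the question.

```sql
-- Hypothetical fixed sample; the table name "demo" and its rows are assumptions.
CREATE TEMP TABLE demo (id int, value text, to_group boolean);

INSERT INTO demo (id, value, to_group) VALUES
   (1,  'bar', NULL)
 , (2,  'foo', NULL)
 , (3,  'bar', NULL)
 , (4,  'bar', NULL)
 , (5,  'foo', TRUE)
 , (6,  'bar', NULL)
 , (7,  'foo', TRUE)
 , (8,  'bar', NULL)
 , (9,  'foo', NULL)
 , (10, 'bar', NULL);

-- Step 1: running count of 'foo' rows.
SELECT id, value, to_group
     , count(value = 'foo' OR NULL) OVER (ORDER BY id) AS grp
FROM   demo
ORDER  BY id;
-- grp for ids 1..10: 0, 1, 1, 1, 2, 2, 3, 3, 4, 4

-- Steps 2 and 3: duplicate the flagged rows into the previous group,
-- then keep only groups that contain exactly two 'foo' rows.
WITH cte AS (
   SELECT *, count(value = 'foo' OR NULL) OVER (ORDER BY id) AS grp
   FROM   demo
   )
SELECT grp, min(id) AS min_id, max(id) AS max_id
FROM  (
   SELECT id, value, to_group, grp     FROM cte
   UNION  ALL
   SELECT id, value, to_group, grp - 1 FROM cte WHERE to_group
   ) sub
GROUP  BY grp
HAVING count(value = 'foo' OR NULL) = 2
ORDER  BY grp;
-- grp | min_id | max_id
--   1 |      2 |      5
--   2 |      5 |      7
```

Row 5 closes group 1 and opens group 2, which is why the copy with grp - 1 is needed. Groups 0, 3 and 4 contain fewer than two 'foo' rows and are dropped by the HAVING clause.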

Answer 1: (score: 0)

The following solution does exactly what I want:

CREATE TEMP TABLE t AS
SELECT id,
       value,
       CASE WHEN value = 'foo' AND random() <= 0.5
            THEN 't' ELSE NULL
       END AS to_group
FROM  (
   SELECT generate_series(1,100) AS id,
          CASE WHEN random() <= 0.2 THEN 'foo' ELSE 'bar' END AS value
   ) t1;



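SELECT array_agg(id)
FROM  (
   -- running sum of the flag: rows flagged NULL stay in the current group
   SELECT *, sum(group_flag) OVER (ORDER BY id) AS group_nr
   FROM  (
      -- NULL for marked 'foo' rows and for 'bar' rows leading up to one; 1 starts a new group
      SELECT *,
             CASE WHEN (to_group = 't' AND value = 'foo')
                    OR (next_to_group = 't' AND value = 'bar') THEN NULL
                  ELSE 1
             END AS group_flag
      FROM  (
         -- for each row, keep only the nearest following 'foo' row;
         -- DISTINCT ON and ORDER BY sit in the same query level so the pick is deterministic
         SELECT DISTINCT ON (t.id)
                t.id, t.value, t.to_group, n.foo_id, n.next_to_group
         FROM   t
         LEFT   JOIN (
            SELECT id AS foo_id, to_group AS next_to_group
            FROM   t
            WHERE  value = 'foo'
            ) n ON n.foo_id > t.id
         ORDER  BY t.id, n.foo_id
         ) t2
      ) t3
   ) t4
GROUP  BY group_nr;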