Counting groups of NULL values - partition or window?

Time: 2016-11-23 19:41:53

Tags: sql postgresql gaps-and-islands

  n | g
 ---------
  1 | 1
  2 | NULL
  3 | 1
  4 | 1
  5 | 1
  6 | 1
  7 | NULL
  8 | NULL
  9 | NULL
 10 | 1
 11 | 1
 12 | 1
 13 | 1
 14 | 1
 15 | 1
 16 | 1
 17 | NULL
 18 | 1
 19 | 1
 20 | 1
 21 | NULL
 22 | 1
 23 | 1
 24 | 1
 25 | 1
 26 | NULL
 27 | NULL
 28 | 1
 29 | 1
 30 | NULL
 31 | 1

From column g above I want to get this result:

 x|y
 ---
 1|4
 2|1
 3|1

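where

x is the number of consecutive NULLs and
y is the number of times a group of that size occurs.

That is, there are 4 groups with only 1 NULL,
1 group with 2 NULLs and
1 group with 3 NULLs.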

2 answers:

Answer 0: (score: 1)

Compute a running count of non-null values with a window function to form groups, then apply two nested counts:

SELECT x, count(*) AS y
FROM  (
   SELECT grp, count(*) FILTER (WHERE g IS NULL) AS x
   FROM  (
      SELECT g, count(g) OVER (ORDER BY n) AS grp
      FROM   tbl
      ) sub1
   WHERE  g IS NULL
   GROUP  BY grp
   ) sub2
GROUP  BY 1
ORDER  BY 1;

count() only counts non-null values.

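Each group grp therefore also includes the preceding row, the one with a non-null value in g, which has to be excluded from the count.

I replaced the HAVING clause from my initial query with WHERE g IS NULL (as @klin uses in his answer), which is simpler.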
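To make the grouping step concrete, here is a minimal Python sketch that replays the logic of the query on the data from the question. It is only an illustration of the idea, not part of the SQL solution; the names NULL_ROWS, grp, run_sizes and result are chosen here for the example.

```python
from collections import Counter
from itertools import accumulate

# Column g from the question for n = 1..31 (None stands for NULL).
NULL_ROWS = {2, 7, 8, 9, 17, 21, 26, 27, 30}
g = [None if n in NULL_ROWS else 1 for n in range(1, 32)]

# Like count(g) OVER (ORDER BY n): running count of non-null values.
# It stays constant across a run of NULLs, so it can serve as a group id.
grp = list(accumulate(0 if v is None else 1 for v in g))

# Like WHERE g IS NULL ... GROUP BY grp: size of each run of NULLs.
run_sizes = Counter(k for k, v in zip(grp, g) if v is None)

# Like the outer GROUP BY: how many runs exist for each size.
result = sorted(Counter(run_sizes.values()).items())
print(result)  # [(1, 4), (2, 1), (3, 1)]
```

The printed pairs are (x, y) and match the result table in the question.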

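If n is a gapless sequence of integers, you can simplify further:

SELECT x, count(*) AS y
FROM  (
   SELECT grp, count(*) AS x
   FROM  (
      SELECT n - row_number() OVER (ORDER BY n) AS grp
      FROM   tbl
      WHERE  g IS NULL
      ) sub1
   GROUP  BY 1
   ) sub2
GROUP  BY 1
ORDER  BY 1;

Eliminate non-null values right away and subtract the row number from n, which arrives directly at the (otherwise meaningless) group numbers ...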
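The subtraction trick can be checked by hand with a small Python sketch, assuming (as stated above) that n is gapless. The list NULL_ROWS holds the n values from the question where g is NULL; the variable names are made up for this example.

```python
from collections import Counter

# n values from the question where g IS NULL, in ascending order.
NULL_ROWS = [2, 7, 8, 9, 17, 21, 26, 27, 30]

# Like n - row_number() OVER (ORDER BY n) on the filtered rows:
# consecutive n values get consecutive row numbers, so the difference
# is constant within a run and changes between runs.
grp = [n - rn for rn, n in enumerate(NULL_ROWS, start=1)]
print(grp)  # [1, 5, 5, 5, 12, 15, 19, 19, 21]

# Inner count per group, then count of groups per size.
run_sizes = Counter(grp)
result = sorted(Counter(run_sizes.values()).items())
print(result)  # [(1, 4), (2, 1), (3, 1)]
```

If n had gaps, two separate runs of NULLs could not be told apart from one run by this difference alone, which is why the simplification depends on the gapless assumption.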

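While the only possible value in g is 1, sum() is a smart trick (like @klin provided). But then the column should arguably have a different type; a numeric type makes little sense for it. So I assume this is just a simplification of the actual problem behind the question.

Answer 1: (score: 0)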
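A short Python sketch of why the sum() variant depends on g only ever being 1. The helper functions and the with_zero data are hypothetical and not taken from the question; running_sum also simplifies SQL behaviour by yielding 0 rather than NULL before the first non-null row.

```python
from itertools import accumulate

def running_count(col):
    # Like count(g) OVER (ORDER BY n): grows by one at every non-null row.
    return list(accumulate(0 if v is None else 1 for v in col))

def running_sum(col):
    # Like sum(g) OVER (ORDER BY n): NULL rows contribute nothing.
    return list(accumulate(0 if v is None else v for v in col))

only_ones = [1, None, 1, None, None]
with_zero = [1, None, 0, None, None]  # hypothetical data with a value other than 1

print(running_count(only_ones), running_sum(only_ones))
# [1, 1, 2, 2, 2] [1, 1, 2, 2, 2]
print(running_count(with_zero), running_sum(with_zero))
# [1, 1, 2, 2, 2] [1, 1, 1, 1, 1]
```

With only 1s both running values agree. With a 0 in the column the running sum no longer changes between the two runs of NULLs, so they would be merged into one group, while the running count still separates them.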

select x, count(x) y
from (
    select s, count(s) x
    from ( 
        select *, sum(g) over (order by i) as s
        from example
        ) s
    where g isnull
    group by 1
    ) s
group by 1
order by 1;

Test it here