Update ordered rows with the last not-null value

Date: 2015-08-26 15:03:11

Tags: sql postgresql

Consider a table with data like the following:

column_a (boolean) | column_order (integer)
TRUE               |     1
NULL               |     2
NULL               |     3
TRUE               |     4
NULL               |     5
FALSE              |     6
NULL               |     7

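I would like to write a query that replaces every NULL value in column_a with the last non-NULL value from the preceding rows, where "preceding" is determined by the order given in column_order. The result should look like this: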

column_a (boolean) | column_order (integer)
TRUE               |     1
TRUE               |     2
TRUE               |     3
TRUE               |     4
TRUE               |     5
FALSE              |     6
FALSE              |     7

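For simplicity, we can assume that the first value is never NULL. The following works as long as there is never more than one consecutive NULL value: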

SELECT
  COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order))
FROM test_table
ORDER BY column_order;

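However, the above does not work for an arbitrary number of consecutive NULL values. What Postgres query would produce the result above? Is there an efficient query that scales well to a large number of rows?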
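
To reproduce the problem, a minimal setup might look like the sketch below. The table and column names come from the question; the column types are assumed from the headers shown above, and the output alias filled is only illustrative. Because lag() looks back exactly one row, row 3 (a NULL directly preceded by another NULL) is still NULL in the result.

```sql
-- Assumed schema: types taken from the column headers in the question.
CREATE TABLE test_table (
  column_a     boolean,
  column_order integer
);

INSERT INTO test_table (column_a, column_order) VALUES
  (TRUE,  1),
  (NULL,  2),
  (NULL,  3),
  (TRUE,  4),
  (NULL,  5),
  (FALSE, 6),
  (NULL,  7);

-- The single-step approach: each NULL is replaced by the value one row back.
-- Row 2 is filled from row 1, but row 3 only sees row 2, which is NULL.
SELECT
  column_order,
  column_a,
  COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order)) AS filled
FROM test_table
ORDER BY column_order;
```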

3 Answers:

Answer 0 (score: 3)

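You can use a handy trick: a running sum over a case expression creates partitions based on where the null and non-null runs divide, and first_value then carries the value forward within each partition.

For example: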

select
  *,
  sum(case when column_a is not null then 1 else 0 end)
    OVER (order by column_order) as partition
from table1;

 column_a | column_order | partition 
----------+--------------+-----------
 t        |            1 |         1
          |            2 |         1
          |            3 |         1
 t        |            4 |         2
          |            5 |         2
 f        |            6 |         3
          |            7 |         3
(7 rows)

Then:

select
  first_value(column_a)
    OVER (PARTITION BY partition ORDER BY column_order),
  column_order
from (
    select
      *,
      sum(case when column_a is not null then 1 else 0 end)
        OVER (order by column_order) as partition
    from table1
) partitioned;

gives you:

 first_value | column_order 
-------------+--------------
 t           |            1
 t           |            2
 t           |            3
 t           |            4
 t           |            5
 f           |            6
 f           |            7
(7 rows)
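
Since the question title asks for an update, the same partitioning idea could be written back to the table roughly as follows. This is a sketch rather than part of the original answer: it assumes column_order is unique (so it can identify the row to update), and the aliases grp and filled are made up for illustration. count(column_a) is used as a shorter spelling of the running sum, since count ignores NULLs.

```sql
-- Sketch: write the forward-filled values back into table1.
-- Assumes column_order is unique within the table.
UPDATE table1 AS t
SET column_a = filled.column_a
FROM (
  SELECT
    column_order,
    -- the first row of each group is the non-NULL row that opened it
    first_value(column_a)
      OVER (PARTITION BY grp ORDER BY column_order) AS column_a
  FROM (
    SELECT
      column_a,
      column_order,
      -- running count of non-NULL values seen so far; NULL rows inherit
      -- the group number of the last non-NULL row before them
      count(column_a) OVER (ORDER BY column_order) AS grp
    FROM table1
  ) partitioned
) filled
WHERE t.column_order = filled.column_order
  AND t.column_a IS NULL;  -- only touch rows that need filling
```

Any leading NULL rows fall into group 0, which has no non-NULL value, so they are left as NULL.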

Answer 1 (score: 1)

I'm not sure whether PostgreSQL supports this, but give it a try:

SELECT
  COALESCE(column_a, (select t2.column_a from test_table t2
                      where t2.column_order < t1.column_order
                        and t2.column_a is not null
                      order by t2.column_order desc
                      fetch first 1 row only))
FROM test_table t1
ORDER BY column_order;
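
For what it's worth, PostgreSQL does accept FETCH FIRST 1 ROW ONLY; LIMIT 1 is the more common spelling there. On the scaling part of the question, the subquery runs once per row, so a partial index over the non-NULL rows may help the planner find the nearest earlier value quickly. The sketch below is an assumption on my part, not something measured, and the index name is hypothetical.

```sql
-- Hypothetical index name. A partial index covering only the non-NULL rows
-- matches the subquery's "column_a IS NOT NULL" filter, so the planner may be
-- able to locate the nearest earlier non-NULL row with a short index scan.
CREATE INDEX test_table_not_null_idx
  ON test_table (column_order)
  WHERE column_a IS NOT NULL;

-- Same correlated subquery as above, written with LIMIT 1.
SELECT
  t1.column_order,
  COALESCE(t1.column_a, (SELECT t2.column_a
                         FROM test_table t2
                         WHERE t2.column_order < t1.column_order
                           AND t2.column_a IS NOT NULL
                         ORDER BY t2.column_order DESC
                         LIMIT 1)) AS column_a
FROM test_table t1
ORDER BY t1.column_order;
```

Checking the plan with EXPLAIN ANALYZE on realistic data would show whether the index is actually used.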

Answer 2 (score: 1)

I'm more familiar with SQL Server, but this should do what you need.

update tableA as a2
set column_a = b2.column_a
from (
  -- for each NULL row, find the column_order of the closest earlier non-NULL row
  select a.column_order, max(b.column_order) as max_order
  from tableA as a
  inner join tableA as b on a.column_order > b.column_order and b.column_a is not null
  where a.column_a is null
  group by a.column_order
) as junx
-- look up the value stored in that earlier row
inner join tableA as b2 on junx.max_order = b2.column_order
where a2.column_order = junx.column_order;