How to fill Nulls with the previous value in PostgreSQL?

Date: 2019-03-18 12:29:09

Tags: postgresql window-functions

I have a table that contains Null values, and I need to replace each of them with the previous non-null value. Here is a sample of my data:

   date    | category | start_period | period_number |
------------------------------------------------------
2018-01-01 |    A     |       1      |       1       |
2018-01-02 |    A     |       0      |      Null     |
2018-01-03 |    A     |       0      |      Null     |
2018-01-04 |    A     |       0      |      Null     |
2018-01-05 |    B     |       1      |       2       |
2018-01-06 |    B     |       0      |      Null     |
2018-01-07 |    B     |       0      |      Null     |
2018-01-08 |    A     |       1      |       3       |
2018-01-09 |    A     |       0      |      Null     |
2018-01-10 |    A     |       0      |      Null     |

The result should look like this:

   date    | category | start_period | period_number |
------------------------------------------------------
2018-01-01 |    A     |       1      |       1       |
2018-01-02 |    A     |       0      |       1       |
2018-01-03 |    A     |       0      |       1       |
2018-01-04 |    A     |       0      |       1       |
2018-01-05 |    B     |       1      |       2       |
2018-01-06 |    B     |       0      |       2       |
2018-01-07 |    B     |       0      |       2       |
2018-01-08 |    A     |       1      |       3       |
2018-01-09 |    A     |       0      |       3       |
2018-01-10 |    A     |       0      |       3       |

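I tried the following query, but it only replaces the first Null value in each run of Nulls, because lag() looks back exactly one row.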

select 
date,
category,
start_period,
case
    when period_number isnull then lag(period_number) over()
    else period_number
end as period_number
from period_table;

I also tried the first_value() window function, but I don't know how to set up the right window.

Any help is greatly appreciated.

3 Answers:

Answer 0: (score: 0)

If you replace the case expression with:

(
    select
        _.period_number
    from
        period_table as _
    where
        _.period_number is not null
        and _.category = period_table.category
        and _.date <= period_table.date
    order by
        _.date desc
    limit 1
) as period_number

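then it will have the desired effect. It is far less elegant than a window function, but I don't think window functions are flexible enough for your particular use case (or at least, if they are, I don't know how to bend them to it).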
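For comparison, a window-function version does seem possible. The following is only a sketch: the CTE inlines the sample rows from the question so the query runs on its own, and the grp column name is made up for illustration. It relies on count() skipping Nulls, so a running count can serve as a group label for each non-null row and the Null rows that follow it.

```sql
-- Sample rows copied from the question; this CTE stands in for period_table.
with period_table (date, category, start_period, period_number) as (
    values
        (date '2018-01-01', 'A', 1, 1),
        (date '2018-01-02', 'A', 0, null),
        (date '2018-01-03', 'A', 0, null),
        (date '2018-01-04', 'A', 0, null),
        (date '2018-01-05', 'B', 1, 2),
        (date '2018-01-06', 'B', 0, null),
        (date '2018-01-07', 'B', 0, null),
        (date '2018-01-08', 'A', 1, 3),
        (date '2018-01-09', 'A', 0, null),
        (date '2018-01-10', 'A', 0, null)
),
grouped as (
    select
        *,
        -- count(col) ignores Nulls, so the running count only goes up on
        -- rows that carry a value. Every Null row therefore shares a grp
        -- (hypothetical name) with the last non-null row before it.
        count(period_number) over (order by date) as grp
    from period_table
)
select
    date,
    category,
    start_period,
    -- Each grp holds at most one non-null value, and max() ignores Nulls,
    -- so this spreads that single value across the whole group.
    max(period_number) over (partition by grp) as period_number
from grouped
order by date;
```

Unlike the correlated subquery, this sketch does not filter by category. On the sample data that makes no difference, because each period covers a single category.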

Answer 1: (score: 0)

You can join the table to itself to get the required value. This assumes your date column is a primary key or a unique key.

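-- Copy the most recent non-null period_number (by date) into each Null row.
update your_table upd set period_number = tbl.period_number
from
(
   -- d2 = latest date at or before b.date that has a non-null period_number
   select b.date, max(b2.date) as d2 from your_table b
   inner join your_table b2 on b2.date <= b.date and b2.period_number is not null
   group by b.date
) t
inner join your_table tbl on tbl.date = t.d2
where t.date = upd.date
  and upd.period_number is null;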

If you don't need to update the table and only need a select statement:

select yt.date, yt.category, yt.start_period, tbl.period_number
from your_table yt
inner join
(
   -- d2 = latest date at or before b.date that has a non-null period_number
   select b.date, max(b2.date) as d2 from your_table b
   inner join your_table b2 on b2.date <= b.date and b2.period_number is not null
   group by b.date
) t on yt.date = t.date
inner join your_table tbl on tbl.date = t.d2;
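The same lookup can also be written with a lateral join, which avoids the grouping step and keeps rows that have no earlier value at all. This is a sketch rather than a drop-in replacement: the CTE inlines a few rows from the question under the your_table name used above, and the prev alias is made up for illustration.

```sql
-- A few sample rows from the question; this CTE stands in for your_table.
with your_table (date, category, start_period, period_number) as (
    values
        (date '2018-01-01', 'A', 1, 1),
        (date '2018-01-02', 'A', 0, null),
        (date '2018-01-03', 'A', 0, null),
        (date '2018-01-05', 'B', 1, 2),
        (date '2018-01-06', 'B', 0, null)
)
select yt.date, yt.category, yt.start_period, prev.period_number
from your_table yt
-- For each outer row, fetch the latest non-null period_number at or
-- before that row's date. "left join ... on true" keeps outer rows even
-- when the subquery finds nothing (period_number is then Null).
left join lateral (
    select p.period_number
    from your_table p
    where p.period_number is not null
      and p.date <= yt.date
    order by p.date desc
    limit 1
) prev on true
order by yt.date;
```

An index on the date column would probably help here on a real table, since the subquery runs once per outer row.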

Answer 2: (score: 0)


An example of window functions with a frame clause:

-- First score in each category, by date. FIRST_VALUE does not skip Nulls.
select
    date, category, score,
    FIRST_VALUE(score) OVER (
        PARTITION BY category
        ORDER BY date
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as first_score
from testing.rec_test
order by date, category;

-- Last score in each category, by date. LAST_VALUE does not skip Nulls either.
select
    date, category, score,
    LAST_VALUE(score) OVER (
        PARTITION BY category
        ORDER BY date
        RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    ) as last_score
from testing.rec_test
order by date, category;
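Applied to the table from the question, a frame that runs from the start of the data up to the current row might look like the sketch below. It rests on an assumption the question does not state outright: period_number never decreases as date increases, so the running maximum equals the most recent non-null value. The CTE inlines the sample rows so the query runs on its own. There is deliberately no PARTITION BY category, because category A appears in two separate periods (1 and 3).

```sql
-- Sample rows copied from the question; this CTE stands in for period_table.
with period_table (date, category, start_period, period_number) as (
    values
        (date '2018-01-01', 'A', 1, 1),
        (date '2018-01-02', 'A', 0, null),
        (date '2018-01-03', 'A', 0, null),
        (date '2018-01-04', 'A', 0, null),
        (date '2018-01-05', 'B', 1, 2),
        (date '2018-01-06', 'B', 0, null),
        (date '2018-01-07', 'B', 0, null),
        (date '2018-01-08', 'A', 1, 3),
        (date '2018-01-09', 'A', 0, null),
        (date '2018-01-10', 'A', 0, null)
)
select
    date,
    category,
    start_period,
    -- max() ignores Nulls. ASSUMPTION: period numbers only grow over
    -- time, so the largest value seen so far is also the latest one.
    max(period_number) over (
        order by date
        rows between unbounded preceding and current row
    ) as period_number
from period_table
order by date;
```

If period numbers can go down or repeat out of order, this shortcut no longer holds and a grouping or subquery approach would be needed instead.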